The app is powered by OpenAI's GPT-4.1 model and uses Xcode's UI testing harness to interact with iOS apps without requiring a jailbreak. It communicates with the UI test through a TCP server to trigger prompts. Known limitations include imperfect keyboard input, trouble capturing the view hierarchy while animations are running, and not waiting for long-running tasks to complete. The app is experimental, and because app contents are sent to OpenAI's API, users are advised to run it in an isolated environment.
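The prompt hand-off over TCP can be pictured with a small sketch. It is written in Python for brevity and is not the project's code: the newline-delimited framing, the `ok` acknowledgement, and the names `PromptHandler`, `start_server`, and `send_prompt` are all assumptions made for illustration, as is the choice of which side listens.

```python
import socket
import socketserver
import threading

# Prompts collected by the handler; a real harness would start acting on them.
received_prompts = []


class PromptHandler(socketserver.StreamRequestHandler):
    """Hypothetical handler: one newline-terminated prompt per line."""

    def handle(self):
        for raw in self.rfile:
            prompt = raw.decode("utf-8").strip()
            if not prompt:
                continue  # ignore blank lines
            received_prompts.append(prompt)
            self.wfile.write(b"ok\n")  # assumed acknowledgement format


def start_server(host="127.0.0.1", port=0):
    """Start the listener on a background thread; port 0 picks a free port."""
    server = socketserver.ThreadingTCPServer((host, port), PromptHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def send_prompt(address, prompt):
    """Send one prompt and return the acknowledgement line."""
    with socket.create_connection(address, timeout=5) as conn:
        conn.sendall(prompt.encode("utf-8") + b"\n")
        return conn.makefile("r", encoding="utf-8").readline().strip()
```

Binding to `127.0.0.1` keeps the listener local to the device in this sketch, which fits the advice to run the tool in an isolated environment.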
Key takeaways:
- PhoneAgent is an iPhone app that uses OpenAI models to carry out tasks across multiple apps, much as a human user would.
- The app reads the accessibility tree of iOS apps and can tap, swipe, scroll, type, and open apps.
- It features an optional Always On mode that listens for prompts starting with a wake word, even when the app is in the background.
- The project uses Xcode's UI testing harness to interact with apps and the system without requiring a jailbreak.
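The wake-word gating behind the Always On mode can be illustrated with a short sketch. It is written in Python for brevity and rests on assumptions: the wake word `agent`, the function name `extract_prompt`, and the punctuation handling are all hypothetical, and the project's own matching logic may differ.

```python
def extract_prompt(transcript, wake_word="agent"):
    """Return the prompt that follows the wake word, or None.

    The default wake word and the matching rules are assumptions for this sketch.
    """
    words = transcript.strip().split(maxsplit=1)
    if not words:
        return None
    # Compare the first word case-insensitively, ignoring trailing punctuation.
    first = words[0].strip(",.!?:;").casefold()
    if first != wake_word.casefold():
        return None
    rest = words[1].strip() if len(words) > 1 else ""
    return rest or None  # a bare wake word carries no prompt
```

Speech that does not begin with the wake word yields `None` here, so in this sketch nothing is forwarded to the model unless the user addresses the agent explicitly.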