Launch HN: CamelQA (YC W24) – AI that tests mobile apps

Mar 20, 2024 - news.ycombinator.com
camelQA is developing an AI agent that automates mobile devices using computer vision, with mobile app QA as its initial use case. The system translates natural-language test cases into tests that run on real iOS and Android devices in a device farm, aiming to eliminate the need for engineers to maintain fragile test scripts. The technology combines accessibility element data with a custom vision-only RCNN object-detection model and Google's SigLIP for UI element classification, so it can detect UI elements even when they expose no accessibility metadata. The agent uses Appium to interface with the device, and GPT-4V and GPT-3.5 for high-level reasoning and action execution.
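The post includes no code, but a minimal sketch of a perceive-reason-act loop like the one described might look like the following. Everything here is a hypothetical illustration rather than camelQA's implementation: the device name, the prompt format, the TAP/DONE action vocabulary, and the use of gpt-4o as a stand-in for the GPT-4V model mentioned above are all assumptions. It presumes an Appium server on localhost and an OpenAI API key in the environment.

```python
# Hypothetical sketch of a screenshot -> LLM -> Appium action loop.
# Not camelQA's code; names and prompt format are illustrative only.
from appium import webdriver
from appium.options.ios import XCUITestOptions
from openai import OpenAI

options = XCUITestOptions()
options.device_name = "iPhone 15"  # hypothetical device in the farm

driver = webdriver.Remote("http://127.0.0.1:4723", options=options)
llm = OpenAI()

test_case = "Open settings and turn on dark mode"  # natural-language test

for _ in range(10):  # cap the number of agent steps
    screenshot_b64 = driver.get_screenshot_as_base64()
    # Ask a vision-capable model for the next action, given the screen.
    response = llm.chat.completions.create(
        model="gpt-4o",  # stand-in for the GPT-4V model in the post
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": f"Test case: {test_case}\n"
                         "Reply with 'TAP x y' or 'DONE'."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{screenshot_b64}"}},
            ],
        }],
    )
    action = response.choices[0].message.content.strip()
    if action.startswith("TAP"):
        _, x, y = action.split()
        driver.tap([(int(x), int(y))])  # tap the screen coordinate
    elif action == "DONE":
        break

driver.quit()
```

In practice, a cheaper text-only model (the post mentions GPT-3.5) could handle steps where the accessibility tree alone is sufficient, reserving the vision model for screens with no usable metadata.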

Because the system is vision-based, it requires no access to source code and works across all app types. The team has built a demo in which users can control Wikipedia on a simulated iPhone using the model. The founders left their corporate jobs after experiencing the grind of constant app testing firsthand and shipping a bug that crashed one of their apps, and they are eager for feedback on their approach.

Key takeaways:

  • camelQA is building an AI agent that automates mobile devices using computer vision, primarily for mobile app QA.
  • The AI combines accessibility element data with a custom vision-only RCNN object-detection model, paired with Google's SigLIP for UI element classification (a sketch of this classification step follows this list).
  • camelQA's system requires no access to source code and works across all app types, including SwiftUI, UIKit, React Native, and Flutter.
  • The founders of camelQA left their corporate jobs to build in the AI space after experiencing the challenges of app testing and shipping a bug in one of their apps.
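As a rough illustration of the SigLIP classification step, here is a minimal sketch using Hugging Face's public google/siglip-base-patch16-224 checkpoint. camelQA's actual model, crop source, and label set are not public, so the filename and labels below are hypothetical placeholders.

```python
# Hypothetical zero-shot classification of a detected UI element crop.
import torch
from PIL import Image
from transformers import AutoModel, AutoProcessor

processor = AutoProcessor.from_pretrained("google/siglip-base-patch16-224")
model = AutoModel.from_pretrained("google/siglip-base-patch16-224")

# A cropped region produced by the detection stage (placeholder file).
crop = Image.open("detected_element.png").convert("RGB")
labels = ["a button", "a text input field", "a toggle switch",
          "a tab bar icon", "a list item"]

inputs = processor(text=labels, images=crop,
                   padding="max_length", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits_per_image  # shape: (1, len(labels))

# SigLIP is trained with a sigmoid loss, so each label gets an
# independent probability rather than a softmax over the set.
probs = torch.sigmoid(logits)[0]
print(labels[int(probs.argmax())], float(probs.max()))
```

The sigmoid scoring is what distinguishes SigLIP from CLIP-style models: labels are scored independently, which suits UI crops that may match several descriptions weakly or none at all.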