The model is ideal for enterprise use, including customer service and back-office tasks, due to its high level of responsibility. Despite scoring 0% on traditional accuracy benchmarks such as VQA-V2, TextVQA, and ChartQA, GOODY-2 excels on the Performance and Reliability Under Diverse Environments (PRUDE-QA) benchmark, outperforming the competition by more than 70 percentage points.
Key takeaways:
- GOODY-2 is a new AI model that prioritizes ethical adherence and safety, refusing to answer any question that could be controversial or problematic.
- GOODY-2 is ideal for enterprise use in customer service, paralegal assistance, and back-office tasks due to its high level of responsibility.
- GOODY-2 outperforms other models on the PRUDE-QA benchmark, scoring 99.8% compared to GPT-4's 28.3%.
- Despite its emphasis on safety and responsibility, GOODY-2 scores 0% on other benchmarks such as VQA-V2, TextVQA, and ChartQA, where GPT-4 scores over 77%.