The team found that LLMs work well for text classification, especially when there isn't enough data to train a task-specific supervised learning model. They used OpenAI’s function calling to get structured data back from the model reliably, which matters for text classification because the predicted labels have to be parsed by downstream code. In their prototype, they vectorised the text of each BBC Tiny Happy People activity and stored the vectors in Pinecone, a managed vector database. They also stored the predicted areas of learning as metadata, so that activities could first be filtered to the relevant category and a vector search then run over that subset.
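
The filter-then-search step described above can be illustrated with a minimal in-memory sketch. It is not the team's code and does not use the Pinecone client: the record layout, the `search` helper, and the three-dimensional toy vectors are all assumptions made for illustration, standing in for real embeddings and a managed vector database.

```python
import math

# Hypothetical toy records. In the prototype these would be embeddings of
# activity text, with the predicted areas of learning stored as metadata.
ACTIVITIES = [
    {"id": "a1", "vector": [0.9, 0.1, 0.0], "areas": ["Literacy"]},
    {"id": "a2", "vector": [0.8, 0.2, 0.1], "areas": ["Mathematics"]},
    {"id": "a3", "vector": [0.1, 0.9, 0.2], "areas": ["Literacy", "Mathematics"]},
]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vector, area, records, top_k=2):
    """Filter records by area-of-learning metadata, then rank by similarity."""
    # Step 1: metadata filter, keeping only activities tagged with the area.
    candidates = [r for r in records if area in r["areas"]]
    # Step 2: vector search restricted to the filtered candidates.
    ranked = sorted(
        candidates,
        key=lambda r: cosine_similarity(query_vector, r["vector"]),
        reverse=True,
    )
    return [r["id"] for r in ranked[:top_k]]


if __name__ == "__main__":
    print(search([1.0, 0.0, 0.0], "Literacy", ACTIVITIES))
```

Filtering first means the similarity ranking only ever compares activities from the requested category, which is the reason for storing the predicted labels alongside the vectors.
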
Key takeaways:
- Nesta’s Discovery Hub is exploring how generative AI, specifically LLMs, can be applied to early-years education and used for social good.
- LLMs like GPT-4 can be used for text classification with zero-shot prompting, or with few-shot prompting to improve performance.
- OpenAI’s function calling can be used to standardise the output format of LLMs, making the output easier to parse for downstream tasks.
- The team at Nesta’s Discovery Hub used GPT-4 and function calling to classify activities from the BBC Tiny Happy People website into the seven Areas of Learning described in the Early Years Foundation Stage (EYFS) statutory framework.
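
As a rough sketch of how function calling can constrain a classifier's output, the snippet below defines a function schema whose only argument is a list drawn from the seven EYFS Areas of Learning, and then validates the JSON arguments string a model might return. The function name `classify_activity`, the field name `areas_of_learning`, and the `parse_arguments` helper are hypothetical and not taken from the team's prototype. No API call is made here; the schema is the kind of definition that would be passed to the model alongside the prompt.

```python
import json

# The seven Areas of Learning in the EYFS statutory framework.
AREAS_OF_LEARNING = [
    "Communication and Language",
    "Physical Development",
    "Personal, Social and Emotional Development",
    "Literacy",
    "Mathematics",
    "Understanding the World",
    "Expressive Arts and Design",
]

# Hypothetical function definition, written in JSON Schema style.
# The enum restricts the labels the model is asked to choose from.
CLASSIFY_FUNCTION = {
    "name": "classify_activity",
    "description": "Assign an activity to one or more EYFS Areas of Learning.",
    "parameters": {
        "type": "object",
        "properties": {
            "areas_of_learning": {
                "type": "array",
                "items": {"type": "string", "enum": AREAS_OF_LEARNING},
            }
        },
        "required": ["areas_of_learning"],
    },
}


def parse_arguments(arguments_json):
    """Parse a JSON arguments string and keep only recognised areas."""
    data = json.loads(arguments_json)
    predicted = data.get("areas_of_learning", [])
    if not isinstance(predicted, list):
        return []
    # Defensive check: drop any label outside the allowed set.
    return [area for area in predicted if area in AREAS_OF_LEARNING]


if __name__ == "__main__":
    example = '{"areas_of_learning": ["Literacy", "Mathematics"]}'
    print(parse_arguments(example))
```

Validating the returned labels against the same list used in the schema keeps downstream code safe even if the model occasionally returns a label outside the enum.
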