
AI for Data Journalism: demonstrating what we can do with this stuff right now

Apr 22, 2024 - simonwillison.net
On 17th April 2024, Simon Willison gave a talk at the Story Discovery at Scale data journalism conference, hosted at Stanford by Big Local News. He discussed current uses of Large Language Models (LLMs) and ran demos of several tools and projects: generating haikus from images, pasting data from Google Sheets into Datasette Cloud, AI-assisted SQL queries, scraping data, enriching data in a table, and using command-line tools for working with LLMs. He also highlighted new developments in LLMs, such as Google Gemini Pro 1.5 and Anthropic's Claude 3 Opus and Haiku models.

Willison also demonstrated LLMs for structured data extraction and for writing and running code. He showed how unstructured text or images can be converted into structured data using the datasette-extract plugin, and how ChatGPT's Code Interpreter mode lets the model generate and execute Python code as part of an ongoing conversation. He noted limitations, such as models hallucinating extra details in the output and limits on output length, while illustrating the potential of LLMs in data journalism and other fields.

Key takeaways:

  • Simon Willison gave a talk at the Story Discovery at Scale data journalism conference, discussing applications of Large Language Models (LLMs) including data extraction and AI-assisted SQL queries.
  • Several live demos were conducted during the talk, demonstrating the use of different tools and models such as Claude 3 Haiku, Datasette Cloud, and Gemini Pro 1.5.
  • LLMs can be used for a variety of tasks, including generating haikus from images, extracting structured data from unstructured text or images, and even executing Python code as part of an ongoing conversation.
  • Willison also highlighted risks and challenges of using LLMs, such as the model hallucinating extra details in the output, and the need to account for the model's output length limit.