The article also walks through getting started with LLM Scraper: installing the required dependencies from npm, setting up an OpenAI API key, and creating a new browser instance to attach LLMScraper to. An example illustrates how to extract the top stories from Hacker News. The article concludes by inviting community contributions to the open-source project.
Key takeaways:
- LLM Scraper is a TypeScript library that converts webpages into structured data using LLMs. It is based on the Playwright framework and supports three operating modes: html, text, and image.
- The library offers full type-safety with TypeScript and uses OpenAI chat models. Schemas are defined with Zod.
- Getting started with LLM Scraper involves installing the required dependencies from npm, setting an OpenAI API key as an environment variable, and creating a new browser instance to attach LLMScraper to.
- The project is open source and welcomes bug reports and improvements from the community, submitted as issues or pull requests.
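
To make the idea of schema-driven, type-safe extraction more concrete, here is a minimal, dependency-free sketch. It does not use LLM Scraper, Zod, or OpenAI. It is a hand-written stand-in for the kind of check a Zod schema performs on the structured data a model returns. The `Story` fields, the `top` key, and the function names are assumptions chosen for illustration, not the library's API.

```typescript
// Hypothetical shape of one Hacker News story; the field names are illustrative only.
interface Story {
  title: string;
  points: number;
  by: string;
}

// Runtime type guard: confirms that an unknown value has the Story shape.
function isStory(value: unknown): value is Story {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["title"] === "string" &&
    typeof v["points"] === "number" &&
    typeof v["by"] === "string"
  );
}

// Parses raw model output (a JSON string) into typed stories and
// throws if the payload does not match the expected shape.
function parseTopStories(raw: string): Story[] {
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("expected a JSON object");
  }
  const top = (parsed as Record<string, unknown>)["top"];
  if (!Array.isArray(top) || !top.every(isStory)) {
    throw new Error("expected `top` to be an array of stories");
  }
  return top;
}
```

Calling `parseTopStories` on a well-formed JSON string yields a `Story[]` that the compiler can check at every use site, while a malformed payload fails with an error instead of propagating untyped data. In a real project, a Zod schema would likely replace the manual guard and infer the static type from the schema definition.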