The system fetches job data from company career pages with tools such as Kadoa, but any traditional scraping method works as well. Relevance scoring works best with certain models, although the default model is chosen for cost reasons, and users can switch models to suit their needs. Because every scoring pass calls a paid API, users are warned that running the script continuously can result in high API usage. Running the app requires an OpenAI or Anthropic Claude API key, Python 3.7 or higher with pipenv for the Flask server, and Node.js with npm for the Next.js client. The system currently handles only job data, but there are plans to extend it to other types of data such as events and products.
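To make the trade-off between model choice and API usage concrete, here is a minimal sketch, not the project's actual code. It assumes a hypothetical `CachedRanker` class that takes the scoring call as a plain function, so the model can be switched per run and each (model, criteria, job) combination is scored only once. The `keyword_overlap_score` function is an offline stand-in for the LLM call; a real setup would replace it with a request to the chosen provider.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

# Hypothetical signature for the scoring call:
# (model, criteria, job_text) -> relevance score in [0, 1].
ScoreFn = Callable[[str, str, str], float]


class CachedRanker:
    """Rank job postings by relevance, scoring each (model, criteria, job) once."""

    def __init__(self, score_fn: ScoreFn, model: str) -> None:
        self.score_fn = score_fn
        self.model = model  # can be reassigned to switch models between runs
        self._cache: Dict[str, float] = {}

    def _key(self, criteria: str, job_text: str) -> str:
        # The model name is part of the key, so switching models rescored jobs.
        raw = "\x00".join((self.model, criteria, job_text))
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()

    def score(self, criteria: str, job_text: str) -> float:
        key = self._key(criteria, job_text)
        if key not in self._cache:
            # Only cache misses reach the (potentially paid) scoring call.
            self._cache[key] = self.score_fn(self.model, criteria, job_text)
        return self._cache[key]

    def rank(self, criteria: str, jobs: List[str]) -> List[Tuple[str, float]]:
        scored = [(job, self.score(criteria, job)) for job in jobs]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)


def keyword_overlap_score(model: str, criteria: str, job_text: str) -> float:
    """Offline stand-in for an LLM call: fraction of criteria words in the job."""
    wanted = set(criteria.lower().split())
    found = wanted & set(job_text.lower().split())
    return len(found) / len(wanted) if wanted else 0.0
```

With this shape, rerunning the ranking over an unchanged job list costs no additional scoring calls, which is one way to keep a continuously running script from inflating API usage.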
Key takeaways:
- Datalens is a personal experiment that uses LLMs to rank unstructured job data based on user-defined criteria, providing a more flexible and personalized job search experience.
- Users can add any job data source they like, and the project includes examples of how to scrape career pages for job data.
- The relevance scoring works best with certain models, but the default model is chosen for cost reasons. Users are warned that running the script continuously can result in high API usage.
- Improvements to the system could include streaming for faster results, extensibility to other types of data, switching to SQLite for storage, and fine-tuning models for better and cheaper results.
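The proposed switch to SQLite could look roughly like the sketch below. It uses only Python's standard `sqlite3` module; the table layout and the helper names `open_store`, `upsert_job`, and `top_jobs` are assumptions made for illustration and are not taken from the project.

```python
import sqlite3
from typing import List, Optional, Tuple

# Hypothetical schema: one row per job posting, keyed by its URL.
SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
    url   TEXT PRIMARY KEY,
    title TEXT NOT NULL,
    score REAL
)
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the job store; defaults to an in-memory database."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def upsert_job(conn: sqlite3.Connection, url: str, title: str,
               score: Optional[float]) -> None:
    """Insert a job, or overwrite the existing row with the same URL."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO jobs (url, title, score) VALUES (?, ?, ?)",
            (url, title, score),
        )


def top_jobs(conn: sqlite3.Connection, limit: int = 10) -> List[Tuple[str, float]]:
    """Return the highest-scoring jobs as (title, score) pairs."""
    rows = conn.execute(
        "SELECT title, score FROM jobs ORDER BY score DESC LIMIT ?", (limit,)
    )
    return rows.fetchall()
```

Keying rows by URL means a rescraped posting replaces its earlier entry instead of creating a duplicate, and stored scores can be read back without calling the model again.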