RAGstack can be run locally or deployed to Google Cloud. It deploys the resources needed for retrieval-augmented generation: open-source LLMs, a vector database, and a server with a UI. It supports the GPT4All, Falcon-7b, and Llama 2 models and uses Qdrant, an open-source vector database. The server and UI handle PDF uploads, letting users chat over their PDFs using Qdrant and their chosen open-source LLM. The roadmap includes support for Llama-2-40b and deployment on AWS.
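To make the PDF flow concrete, here is a minimal ingestion sketch: chunk the extracted PDF text, embed the chunks, and store them in Qdrant. The collection name, chunk size, and embedding model are illustrative assumptions, not RAGstack's actual internals.

```python
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
client = QdrantClient(host="localhost", port=6333)  # local Qdrant instance


def ingest(text: str, collection: str = "pdf_chunks", chunk_size: int = 500) -> None:
    # Naive fixed-size chunking; a real pipeline would split on sentences or sections.
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    vectors = embedder.encode(chunks).tolist()

    # Create (or reset) the collection sized to the embedding dimension.
    client.recreate_collection(
        collection_name=collection,
        vectors_config=VectorParams(size=len(vectors[0]), distance=Distance.COSINE),
    )
    # Store each chunk's vector with the raw text as payload for later retrieval.
    client.upsert(
        collection_name=collection,
        points=[
            PointStruct(id=i, vector=vec, payload={"text": chunk})
            for i, (vec, chunk) in enumerate(zip(vectors, chunks))
        ],
    )
```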
Key takeaways:
- RAGstack is a private ChatGPT alternative that can be hosted within a VPC and connected to an organization's knowledge base, supporting open-source LLMs like Llama 2, Falcon, and GPT4All.
- The RAG (Retrieval Augmented Generation) technique augments the capabilities of a large language model by retrieving information from other systems and inserting it into the LLM’s context window via a prompt (a minimal sketch follows this list).
- RAGstack can be run locally or deployed to Google Cloud, and it includes resources for retrieval-augmented generation such as an open-source LLM, a vector database, and a server with a UI.
- The roadmap for RAGstack includes support for Llama-2-40b and deployment on AWS, in addition to the already supported GPT4All, Falcon-7b, and deployment on GCP.
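The retrieve-then-generate step can be sketched as follows: embed the question, fetch the closest chunks from Qdrant, and paste them into the prompt sent to a local open-source model (here GPT4All). The collection name, model file, and prompt template are assumptions for illustration, not RAGstack's actual code.

```python
from gpt4all import GPT4All
from qdrant_client import QdrantClient
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")
client = QdrantClient(host="localhost", port=6333)
llm = GPT4All("orca-mini-3b-gguf2-q4_0.gguf")  # assumed local model file


def answer(question: str, collection: str = "pdf_chunks", top_k: int = 3) -> str:
    # Retrieve the chunks most similar to the question.
    hits = client.search(
        collection_name=collection,
        query_vector=embedder.encode(question).tolist(),
        limit=top_k,
    )
    context = "\n\n".join(hit.payload["text"] for hit in hits)

    # Insert the retrieved context into the LLM's context window via the prompt.
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    return llm.generate(prompt, max_tokens=256)
```

Because the retrieved chunks come from the organization's own documents, the same pattern works unchanged whether the stack runs locally or inside a VPC on GCP.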