RAG can improve the performance of LLMs on tasks such as summarization and translation over data that it is not practical to fine-tune on. Techniques to improve RAG performance in production include hybrid search, summaries, overlapping chunks, fine-tuned embedding models, metadata, re-ranking, and avoiding the "lost in the middle" problem. RAG comes out of the box with LLMStack, which takes care of chunking the data, generating embeddings, and storing them in the vector store.
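To make the "overlapping chunks" technique concrete, here is a minimal sketch of a chunker that slides a fixed-size window over the text with some overlap, so that sentences falling on a chunk boundary still appear whole in at least one chunk. The function name, sizes, and character-based splitting are illustrative assumptions, not LLMStack's actual implementation (which chunks data for you automatically).

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks where consecutive chunks
    share `overlap` characters, so boundary content is not lost.
    Character-based for simplicity; token-based splitting is common in practice."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already covers the end of the text
    return chunks
```

Each chunk's tail repeats as the next chunk's head, which is what keeps a fact that straddles a boundary retrievable from at least one chunk.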
Key takeaways:
- Retrieval Augmented Generation (RAG) is an architecture that improves the performance of large language models (LLMs) by passing relevant information to the model along with the question/task details.
- RAG involves three main stages: data preparation, retrieval, and generation. The quality of the output depends on the quality of the data and the retrieval strategy.
- Several techniques improve RAG performance in production, including hybrid search, summaries, overlapping chunks, fine-tuned embedding models, metadata, re-ranking, and addressing the "lost in the middle" problem.
- A RAG pipeline comes out of the box with LLMStack, which takes care of chunking the data, generating embeddings, storing them in the vector store, retrieving the relevant data, and passing it to the LLM for generation.
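The retrieval and generation stages above can be sketched end to end. The snippet below is a toy illustration, not LLMStack's API: the bag-of-words "embedding" stands in for a real embedding model, and the assembled prompt would normally be sent to an LLM rather than returned as a string.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a production pipeline would call a
    # trained embedding model and store vectors in a vector database.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Retrieval stage: rank stored chunks by similarity to the query.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def build_prompt(query: str, docs: list[str], k: int = 2) -> str:
    # Generation stage (prompt assembly): pass retrieved context to the LLM
    # alongside the question.
    context = "\n".join(retrieve(query, docs, k))
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"
```

Because the best-matching chunks are placed in the prompt, output quality depends directly on how well the data was chunked and how good the retrieval ranking is, which is the point of the tuning techniques listed above.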