How To Solve LLM Hallucinations

Jun 14, 2024 - morethanmoore.substack.com
Startup Lamini has developed a new methodology that it says reduces hallucinations in large language models (LLMs) by 95%. The approach, called Memory Tuning, embeds specific data directly into a model so that it can recall exact facts, sharply reducing hallucinations. It works by tuning each expert in a Mixture of Memory Experts (MoME) on curated data at a rate 100 times higher than regular fine-tuning, enabling the model to hold raw, random-string-style information as part of its own weights. Lamini reports that the method has a near-zero effect on the rest of the model, preserving its general reasoning capabilities.
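Lamini has not published implementation details in this summary, but the idea maps onto a familiar mixture-of-experts pattern: a router sends each token to a few small, trainable "memory" adapters sitting on top of a frozen base model. The sketch below is a minimal, hypothetical illustration of that pattern in PyTorch; the class names, sizes, and routing scheme are assumptions, not Lamini's actual architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MemoryExpert(nn.Module):
    """A small adapter meant to memorize a narrow slice of facts.
    Low-rank, so it is cheap enough to train to near-zero loss."""
    def __init__(self, d_model: int, rank: int = 8):
        super().__init__()
        self.down = nn.Linear(d_model, rank, bias=False)
        self.up = nn.Linear(rank, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.up(self.down(x))

class MixtureOfMemoryExperts(nn.Module):
    """Routes each token to its top-k memory experts and adds their
    output to the frozen base representation as a residual update."""
    def __init__(self, d_model: int, n_experts: int = 32, top_k: int = 2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            MemoryExpert(d_model) for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model) hidden states from the frozen base model.
        logits = self.router(x)                          # (B, S, E)
        weights, idx = logits.topk(self.top_k, dim=-1)   # pick k experts/token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = (idx[..., k] == e).unsqueeze(-1)  # tokens routed here
                out = out + mask * weights[..., k:k + 1] * expert(x)
        return x + out  # residual: tokens no expert claims pass through
```

Because the experts contribute only a residual update, tokens that no expert strongly claims pass through almost unchanged, which is one plausible reading of the reported near-zero effect on the rest of the model.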

Lamini's Memory Tuning/MoME capability has already been adopted by several customers, including a Fortune 500 company that has seen a tenfold reduction in hallucinations in text-to-SQL code generation. The approach could redefine machine learning compute profiles, much as transformers did relative to convolutional neural networks. However, further work is needed to understand how inference-time compute changes and what that could mean for the silicon industry.

Key takeaways:

  • Lamini, a startup led by CEO Sharon Zhou and CTO Greg Diamos, has developed a new methodology to reduce hallucinations in large language models (LLMs) by 95%.
  • The company's approach, called Memory Tuning, involves embedding specific data into models, creating a 'Mixture of Memory Experts' (MoME) that can recall exact data (a minimal training-loop sketch follows this list).
  • This method could be particularly useful for models dealing with specific knowledge domains, such as a company's product portfolio or support documentation.
  • While the computational requirements for Memory Tuning are lower than those of regular fine-tuning, it is unclear how the method changes the compute profile of inference and whether it could drive a shift in computer architecture.
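To make the "tuned far beyond regular fine-tuning" idea concrete, here is a hypothetical training loop that optimizes only the adapter parameters until the curated facts are reproduced near-exactly, i.e. loss driven toward zero, rather than stopping early as general fine-tuning does. It assumes a Hugging Face-style causal LM interface (`model(input_ids=..., labels=...).loss`); `memory_tune` and `facts` are illustrative names, not Lamini's API.

```python
import torch
from torch.utils.data import DataLoader

def memory_tune(model, facts, lr=1e-3, target_loss=1e-3, max_epochs=100):
    """Train only the trainable (adapter) parameters of `model` until the
    curated facts are memorized, i.e. mean loss falls below `target_loss`.
    `facts` is a list of (input_ids, labels) tensor pairs, padded to a
    common length so the default collate function can batch them."""
    params = [p for p in model.parameters() if p.requires_grad]
    opt = torch.optim.AdamW(params, lr=lr)
    loader = DataLoader(facts, batch_size=8, shuffle=True)
    for epoch in range(max_epochs):
        total = 0.0
        for input_ids, labels in loader:
            loss = model(input_ids=input_ids, labels=labels).loss
            opt.zero_grad()
            loss.backward()
            opt.step()
            total += loss.item()
        if total / len(loader) < target_loss:  # facts memorized exactly
            break
    return model
```

Deliberately overfitting the experts to the curated corpus is the opposite of standard fine-tuning practice, and it only avoids damaging general capability because the base model's weights stay frozen and only the small memory experts absorb the facts.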