
With LLMs, Enterprise is Different: Data

Aug 19, 2023 - colinharman.substack.com
The article discusses the unique challenges of building Large Language Model (LLM) solutions for enterprises, as opposed to the startup-to-consumer setting. The author highlights that enterprise data is often more closed-domain, larger in size, more archival, and subject to more complex access control. Enterprise data also tends to involve more modalities and to be more regularly structured than consumer data. The author suggests strategies for dealing with these differences, such as assessing whether a domain is open or closed, reducing noise in retrieval systems, and implementing proper access control.

The author also suggests that enterprise LLM projects should focus on problems where the supporting data is well structured and can be filtered down to a unit level. Use cases involving different modalities should be treated as separate projects and solved individually before being combined. The author concludes by inviting feedback on potential future topics, including use cases, dealing with too much or closed-domain data, security and compliance, requirements, patterns for retrieval and generation, data slicing, access control, architecture and infrastructure, LLM strategy, safety, and team dynamics.

Key takeaways:

  • Developing solutions involving Large Language Models (LLMs) for mature businesses is fundamentally different from developing them for startups, with unique challenges and risks. Most advice on LLM projects comes from a startup-to-consumer perspective, which may not suit enterprise environments.
  • Enterprise data differs from startup/consumer data in several ways: it is more closed-domain, generally larger, more focused on the population level, spread across more modalities, more structured, and subject to more complex access control. These differences can affect how well LLM applications work.
  • Enterprises often have more data than individual consumers, which increases the likelihood that unhelpful records are retrieved as relevant. This can be mitigated by limiting the amount of data the application operates over and by focusing on a problem where the supporting data is well structured and can be filtered down to a unit level.
  • Understanding the structure and access control of your data is crucial in enterprise LLM projects. The more you know about the data, the simpler and more reliable the system you can build to work with it. However, implementing access control properly can be complex and requires substantial engineering effort.
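
The last two takeaways (filtering data down to a unit level and enforcing access control before retrieval) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the article: the `Record` fields, the group-based permission model, and the naive word-overlap ranking that stands in for whatever retrieval method a real system would use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    # Hypothetical schema: a "unit" could be a customer, contract, or case.
    record_id: str
    unit_id: str
    allowed_groups: frozenset
    text: str


def candidate_records(records, unit_id, user_groups):
    """Keep only records for one unit that the user is allowed to read."""
    user_groups = set(user_groups)
    return [
        r for r in records
        if r.unit_id == unit_id and r.allowed_groups & user_groups
    ]


def rank_by_overlap(query, records, top_k=3):
    """Rank records by shared words with the query (stand-in for real retrieval)."""
    query_terms = set(query.lower().split())
    scored = []
    for r in records:
        overlap = len(query_terms & set(r.text.lower().split()))
        if overlap > 0:
            scored.append((overlap, r))
    # Sort on the score only, so Record objects are never compared.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in scored[:top_k]]


# Made-up sample data for illustration only.
records = [
    Record("r1", "customer-17", frozenset({"support"}), "invoice dispute for march order"),
    Record("r2", "customer-17", frozenset({"finance"}), "invoice payment terms net 30"),
    Record("r3", "customer-42", frozenset({"support"}), "invoice dispute escalation"),
]

visible = candidate_records(records, "customer-17", {"support"})
results = rank_by_overlap("invoice dispute", visible)
```

Filtering happens before ranking, so records from other units or without a matching permission never enter the candidate set and cannot be surfaced as noise or leaked to the generation step.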