The author also introduces the SuperSonic framework, which uses a Schema Mapper, an LLM, and a Semantic Layer to process and answer complex queries. They discuss the evolution of their OLAP architecture, including the decision to replace ClickHouse with Apache Doris and to split flat tables into metric tables and dimension tables. Other Apache Doris features the author found useful include Materialized View, the Flink-Doris-Connector, and Compaction. Future plans include testing Doris's newly released Storage-Compute Separation and Cross-Cluster Replication.
Key takeaways:
- The team replaced ClickHouse with Apache Doris as the OLAP engine for their data management system and used a Large Language Model (LLM) to translate natural language questions into SQL statements, lowering the barrier to writing SQL.
- They addressed four issues with the LLM: its lack of understanding of data jargon, slow inference, lack of niche knowledge, and the need for more diverse information. The respective fixes were introducing a semantic layer, creating LLM parsing rules, adding a Schema Mapper, and using plugins.
- The team developed the SuperSonic framework, which uses a Schema Mapper, LLM, and a Semantic Layer to process and answer complex queries. They also optimized their OLAP architecture by streamlining links and splitting flat tables into metric and dimension tables.
- Future plans include testing the newly released Storage-Compute Separation and Cross-Cluster Replication of Doris to reduce costs and increase service availability. The team is also open to ideas and input on the SuperSonic framework and the Apache Doris project.
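
To make the Schema Mapper and Semantic Layer idea from the takeaways above more concrete, here is a minimal Python sketch. It is not SuperSonic's code or API: the table names (`song_metrics`, `song_dim`), the column names, and the functions `map_schema` and `build_sql` are all hypothetical, and plain substring matching stands in for the LLM step. It only illustrates how business terms defined in a semantic layer might resolve to SQL over separate metric and dimension tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    name: str        # business term a user might type
    expression: str  # SQL aggregate over the metric table (aliased "m")
    table: str       # metric table holding the raw measure


@dataclass(frozen=True)
class Dimension:
    name: str      # business term a user might type
    column: str    # column in the dimension table
    table: str     # dimension table
    join_key: str  # key shared with the metric table


# Hypothetical semantic-layer definitions; all names are illustrative only.
METRICS = {"plays": Metric("plays", "SUM(m.play_cnt)", "song_metrics")}
DIMENSIONS = {"genre": Dimension("genre", "genre", "song_dim", "song_id")}


def map_schema(question: str):
    """Toy schema mapper: find known metric/dimension terms by substring."""
    q = question.lower()
    metrics = [m for name, m in METRICS.items() if name in q]
    dims = [d for name, d in DIMENSIONS.items() if name in q]
    return metrics, dims


def build_sql(metrics, dims) -> str:
    """Assemble SQL that joins the metric table to each dimension table."""
    if not metrics:
        raise ValueError("no known metric found in the question")
    select, joins, group_by = [], [], []
    for i, d in enumerate(dims):
        alias = f"d{i}"  # one alias per joined dimension table
        select.append(f"{alias}.{d.column}")
        group_by.append(f"{alias}.{d.column}")
        joins.append(
            f"JOIN {d.table} {alias} ON m.{d.join_key} = {alias}.{d.join_key}"
        )
    select += [f"{m.expression} AS {m.name}" for m in metrics]
    parts = [f"SELECT {', '.join(select)}", f"FROM {metrics[0].table} m", *joins]
    if group_by:
        parts.append("GROUP BY " + ", ".join(group_by))
    return " ".join(parts)


if __name__ == "__main__":
    found_metrics, found_dims = map_schema("Total plays by genre")
    print(build_sql(found_metrics, found_dims))
```

Under these assumptions, the metric definition lives in one place, so the same business term always expands to the same aggregate regardless of how the question is phrased. A real system would replace the substring matching with the LLM and the Schema Mapper the article describes.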