Semantic Chunker, SDPM Chunker, Late Chunker, Slumber Chunker, Embeddings Refinery, and Overlap Refinery are already available through the API client, and future updates will add them to the TypeScript version itself. These features aim to enhance chunk quality, reduce token usage, and improve context preservation. Chonkie is free, open-source, and licensed under MIT, and its developers welcome feedback, ideas, and contributions from the community.
Key takeaways:
- Chonkie is an open-source library for advanced chunking and embedding of text and code, now available in TypeScript.
- It offers several native chunkers, including Code Chunker, Recursive Chunker, Token Chunker, and Sentence Chunker, all of which support custom tokenizers and delimiters.
- Upcoming features include Semantic Chunker, SDPM Chunker, Late Chunker, Slumber Chunker, Embeddings Refinery, and Overlap Refinery.
- Chonkie is free, open-source, and MIT licensed, with a focus on improving text retrieval and performance in AI projects.
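
To make the idea of token-based chunking with a custom tokenizer more concrete, here is a minimal sketch of the general technique. It is not Chonkie's API: the names `tokenChunk`, `Tokenizer`, `Chunk`, and `whitespaceTokenizer` are hypothetical, and the whitespace tokenizer is only a stand-in for whatever tokenizer a real project would plug in.

```typescript
// Illustrative sketch only. This is NOT Chonkie's API; every name is hypothetical.

// A tokenizer is any function that turns text into a list of tokens.
type Tokenizer = (text: string) => string[];

// Simplest possible tokenizer: split on runs of whitespace.
const whitespaceTokenizer: Tokenizer = (text) =>
  text.split(/\s+/).filter((t) => t.length > 0);

interface Chunk {
  text: string;
  tokenCount: number;
}

// Split text into windows of at most `chunkSize` tokens.
// Consecutive windows share `overlap` tokens, which helps preserve context.
function tokenChunk(
  text: string,
  chunkSize: number,
  overlap = 0,
  tokenize: Tokenizer = whitespaceTokenizer,
): Chunk[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error("overlap must be in [0, chunkSize)");
  }

  const tokens = tokenize(text);
  const chunks: Chunk[] = [];
  const step = chunkSize - overlap;

  for (let start = 0; start < tokens.length; start += step) {
    const slice = tokens.slice(start, start + chunkSize);
    // Simplification: tokens are re-joined with single spaces,
    // so the original whitespace is not preserved.
    chunks.push({ text: slice.join(" "), tokenCount: slice.length });
    // Stop once this window has reached the end of the token list.
    if (start + chunkSize >= tokens.length) break;
  }
  return chunks;
}
```

Calling `tokenChunk("a b c d e f g", 3, 1)` yields three chunks of three tokens each, with each chunk repeating the last token of the previous one. Passing a different `tokenize` function (for example, one backed by a model-specific tokenizer) changes how chunk sizes are measured without touching the chunking logic.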