Interview

Chonkie: Open-source document chunking for LLM RAG pipelines — 180K downloads, 200 projects, and LlamaIndex dependency

Jun 11, 2025 with Trey

Key Points

Chonkie, a two-person YC S25 startup, has reached 180,000 downloads and more than 200 dependent projects by addressing a core problem: retrieval accuracy on multi-document corpora collapses without structured chunking.
An eval showed o3 achieved 75% retrieval accuracy on dense documents without chunking, jumping to 100% after processing through Chonkie.
LlamaIndex uses Chonkie as a core dependency, and 10 to 12 companies in the current YC batch already use it as the ingestion layer for LLM RAG pipelines.

Summary

Chonkie, a two-person YC S25 company founded by Trey and a childhood friend, builds open-source document chunking infrastructure for LLM retrieval-augmented generation pipelines. Placing an entire document in a large context window works for a single document, but the approach breaks down across thousands of PDFs. Without structured chunking, frontier models struggle with retrieval accuracy on dense, multi-format documents.

“We take really complex documents, split them up into meaningful pieces such that one piece is one idea, and then send your LLM only the data it needs to answer questions. We ran this eval after the price drop on O3: classic literature to O3 got a retrieval accuracy of 75%. We chunk the data through Chonkie then asked O3 the same thing — always 100%. We've got over 180,000 downloads and over 200 projects using us. We're a core dependency on projects like LlamaIndex.”
— Trey

Trey ran an eval the day before Demo Day, after OpenAI's o3 price drop. He fed classic literature into o3 and asked pointed retrieval questions. Without chunking, o3 hit 75% retrieval accuracy. After running the same documents through Chonkie, accuracy reached 100%.

The product sits at the ingestion layer of an LLM stack. Developers feed documents in, get embeddings out, and can route those into their own vector database or let Chonkie wrap around one. Integration takes two to five lines of code. For static corpora, chunking runs asynchronously at ingest. For live use cases like code generation, it runs in real time.
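The chunk-then-retrieve idea described above can be illustrated with a minimal Python sketch. This is not Chonkie's API: the function names chunk_paragraphs and top_chunks are hypothetical, paragraphs stand in for "one piece is one idea", and simple word overlap stands in for the embedding similarity a real pipeline would use.

```python
import re


def chunk_paragraphs(text):
    """Split text on blank lines so each paragraph becomes one chunk.

    Assumption for illustration: one paragraph roughly equals one idea.
    """
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def top_chunks(question, chunks, k=1):
    """Return the k chunks sharing the most words with the question.

    Word overlap is a stand-in for embedding similarity.
    """
    q_words = tokenize(question)
    ranked = sorted(chunks, key=lambda c: len(q_words & tokenize(c)), reverse=True)
    return ranked[:k]


document = """Captain Ahab commands the whaling ship Pequod.

Elizabeth Bennet lives at Longbourn with her four sisters.

Sherlock Holmes shares rooms at 221B Baker Street."""

chunks = chunk_paragraphs(document)
# Only the best-matching chunk would be sent to the model as context.
context = top_chunks("Where does Sherlock Holmes live?", chunks, k=1)
print(context)
```

In this toy example the model would receive one short paragraph rather than the whole document, which is the effect the ingestion layer aims for at much larger scale.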

Chonkie started as a side project in February 2025 before entering YC. It has reached 180,000 downloads, and more than 200 projects depend on it. The project is a core dependency of LlamaIndex. Ten to twelve companies from the current YC batch are already using it. Trey expected to close a funding round by Friday after Demo Day.
