AI Research
Retrieval-Augmented Generation at Scale: Lessons Learned
We process millions of documents across our products. Here's what we learned about chunking strategies, embedding models, and hybrid retrieval that the benchmarks don't tell you.
Engineering Team · neww.ai · March 10, 2026 · 14 min read
RAG is harder than the demos suggest. At our scale (millions of documents across thirty products), small choices compound into large differences in quality.
This post covers what the benchmarks leave out: chunking that respects semantic boundaries, hybrid retrieval that fuses BM25 with embeddings, re-rankers that earn their latency cost, and the eval harness that keeps us honest.
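To make "fusing BM25 with embeddings" concrete, here is a minimal sketch of one common fusion method, reciprocal rank fusion (RRF). This is an illustration under stated assumptions, not necessarily the method used in production: the function name `reciprocal_rank_fusion`, the constant `k=60`, and the example document IDs are all hypothetical, and the two input lists stand in for the top results from a BM25 index and an embedding index.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in.
    k=60 is a commonly used default, assumed here for illustration.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical top-3 results from each retriever.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
embedding_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([bm25_hits, embedding_hits])
print(fused)
```

RRF uses only rank positions, so it sidesteps the problem that BM25 scores and cosine similarities live on different scales. Documents that both retrievers rank highly rise to the top, while documents found by only one retriever still make the fused list.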