Retrieval-Augmented Generation at Scale: Lessons Learned

We process millions of documents across our products. Here's what we learned about chunking strategies, embedding models, and hybrid retrieval that the benchmarks don't tell you.

Engineering Team · neww.ai · March 10, 2026 · 14 min read

RAG is harder than the demos suggest. At our scale, with millions of documents across thirty products, small choices compound into large differences in quality.

This post covers what the benchmarks won't tell you: chunking that respects semantic boundaries, hybrid retrieval that fuses BM25 with embeddings, re-rankers that actually earn their latency cost, and the eval harness that keeps us honest.
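
To make "fusing BM25 with embeddings" concrete, here is a minimal sketch of one common way to combine two ranked lists, reciprocal rank fusion (RRF). This is an illustration under assumptions, not necessarily the method we run in production: the function name, the document IDs, and the constant `k=60` (a conventional default for RRF) are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    so documents ranked well by several retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID to keep output deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a BM25 index and an embedding index.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
dense_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
print(fused)
```

RRF uses only rank positions, so it sidesteps the problem that BM25 scores and embedding similarities live on different scales. A weighted sum of normalized scores is a reasonable alternative when the raw scores are well calibrated.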
