RAG Tuner.

An internal tool that finds the optimal chunk size, overlap, and reranker for any corpus — one click, sweep done in minutes.

Role: R&D · Stellar Labs
Client: Stellar Labs
Year: 2024
Duration: 5 weeks

// previewRAG Tuner

median recall@5 · tuned0.91

recall vs. default+23%

teams using4

median sweep time12 min

/ 01 — The problem

What was broken.

Every team building a RAG feature kept asking the same question: 'what chunk size should I use?' The honest answer was 'depends on your data' — but nobody had time to find out.

/ 02 — The approach

How I tackled it.

Built a tool that takes a corpus + a set of eval queries, sweeps across chunk sizes / overlap / rerankers, and outputs a ranked recommendation. Visualized recall@k so engineers could see the trade-off, not just take a number on faith.

/ 03 — The outcome

What shipped.

Used by 4 internal teams. Average recall@5 jumped from 0.74 (default config) to 0.91 (tuned).

/ Like what you see?

Let’s talk.

→Get in touch