Head-to-head
20 cases. Same prompt. Same retrieval. Rubric-scored every night. When we lose, we publish it. When we win, you get the data.