When to use
Use a judge when you want consistent, scalable evaluation of:
- Hallucinations, safety/policy violations
- Response quality (helpfulness, tone, structure)
- Latency, cost, and error patterns tied to specific criteria
Continuously evaluate your production traffic with judges that learn over time.