Your agent produces thousands of outputs, but without feedback you can’t tell which ones are good. Feedback closes the loop — it connects real-world quality judgments to the traces, spans, and completions your agent generates. ZeroEval supports two kinds of feedback:
  • Human feedback — thumbs-up/down, star ratings, corrections, and expected outputs submitted by users or reviewers
  • AI feedback — automated evaluations from calibrated judges that score outputs against criteria you define
Both feed into the same system. Feedback attached to completions powers prompt optimization. You can also retrieve unified feedback — combining human reviews and judge evaluations — for any span, trace, or session via the Feedback API.
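Retrieving unified feedback could look like the sketch below. The base URL, endpoint path, and response shape are assumptions for illustration, not the documented ZeroEval API; check the Feedback API reference for the real endpoints.

```python
import json
import urllib.request

API_BASE = "https://api.zeroeval.com"  # hypothetical base URL


def feedback_url(entity_type: str, entity_id: str) -> str:
    """Build a (hypothetical) unified-feedback endpoint for a span, trace, or session."""
    if entity_type not in {"span", "trace", "session"}:
        raise ValueError(f"unknown entity type: {entity_type}")
    return f"{API_BASE}/v1/{entity_type}s/{entity_id}/feedback"


def get_unified_feedback(entity_type: str, entity_id: str, api_key: str) -> dict:
    """Fetch combined human and judge feedback for one span, trace, or session."""
    req = urllib.request.Request(
        feedback_url(entity_type, entity_id),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The same pattern applies to any of the three entity types: one call returns human reviews and judge evaluations together, so downstream code never needs to merge two feeds.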

How feedback flows

1. Agent produces output. Your agent runs and ZeroEval captures the full trace: inputs, outputs, model, and prompt version.
2. Feedback is attached. Humans review outputs in the dashboard or your app submits feedback programmatically, and judges evaluate outputs automatically based on your criteria.
3. Quality becomes measurable. Feedback appears on spans, traces, and completions in the console. Filter by thumbs-up rate, judge scores, or tags to find patterns.
4. Improvements are driven by data. Use feedback to optimize prompts, compare models, calibrate judges, and catch regressions before users do.
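Once feedback records are in hand, metrics like the thumbs-up rate in step 3 reduce to simple aggregation. This is a local sketch over an assumed record shape; the `rating` and `judge_score` fields are illustrative, not ZeroEval's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FeedbackRecord:
    """Assumed shape of one feedback entry (illustrative, not the real schema)."""
    span_id: str
    rating: Optional[str]         # "up" / "down" from a human reviewer, or None
    judge_score: Optional[float]  # 0.0-1.0 from an AI judge, or None


def thumbs_up_rate(records: list[FeedbackRecord]) -> Optional[float]:
    """Fraction of human-rated records marked thumbs-up; None if nothing was rated."""
    rated = [r for r in records if r.rating is not None]
    if not rated:
        return None
    return sum(r.rating == "up" for r in rated) / len(rated)


def low_scoring(records: list[FeedbackRecord], threshold: float = 0.5) -> list[str]:
    """Span IDs whose judge score falls below the threshold: candidates for review."""
    return [
        r.span_id
        for r in records
        if r.judge_score is not None and r.judge_score < threshold
    ]
```

Filtering like this is what turns raw feedback into the patterns step 4 acts on, whether you compute it locally or via the console's filters.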

Get started