Overview

When calibrating judges, you can submit feedback programmatically using the SDK. This is useful for:
  • Bulk feedback submission from automated pipelines
  • Integration with custom review workflows
  • Syncing feedback from external labeling tools
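
For example, feedback synced from an external labeling tool might look like the sketch below. The export format shown here is illustrative (field names like span_id and judge_was_correct are assumptions); only the send_feedback call itself reflects the SDK.

from zeroeval import ZeroEval

client = ZeroEval()

# Hypothetical records exported from an external labeling tool. Each record
# is assumed to carry the span ID it refers to and a human verdict on the judge.
external_labels = [
    {"span_id": "span-id-1", "judge_was_correct": True, "note": "Matches human review"},
    {"span_id": "span-id-2", "judge_was_correct": False, "note": "Judge missed a factual error"},
]

for label in external_labels:
    client.send_feedback(
        prompt_slug="your-judge-task-slug",    # task slug associated with the judge
        completion_id=label["span_id"],        # span ID the label refers to
        thumbs_up=label["judge_was_correct"],  # True = judge was correct
        reason=label["note"],
        judge_id="your-judge-id",              # required for judge feedback
    )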

Important: Using the Correct IDs

Judge evaluations involve two related spans:
ID                   Description
Source Span ID       The original LLM call that was evaluated
Judge Call Span ID   The span created when the judge ran its evaluation
When submitting feedback, always include the judge_id parameter to ensure feedback is correctly associated with the judge evaluation.

Python SDK

The easiest way to get the correct IDs is from the Judge Evaluation modal:
  1. Open a judge evaluation in the dashboard
  2. Expand the “SDK Integration” section
  3. Click “Copy” to copy the pre-filled Python code
  4. Paste and customize the generated code

Manual Submission

from zeroeval import ZeroEval

client = ZeroEval()

# Submit feedback for a judge evaluation
client.send_feedback(
    prompt_slug="your-judge-task-slug",  # The task/prompt associated with the judge
    completion_id="span-id-here",         # The span ID from the evaluation
    thumbs_up=True,                        # True = correct, False = incorrect
    reason="Optional explanation",
    judge_id="automation-id-here",         # Required for judge feedback
)

Parameters

Parameter        Type   Required  Description
prompt_slug      str    Yes       The task slug associated with the judge
completion_id    str    Yes       The span ID being evaluated
thumbs_up        bool   Yes       True if judge was correct, False if wrong
reason           str    No        Explanation of the feedback
judge_id         str    Yes*      The judge automation ID (*required for judge feedback)
expected_score   float  No        For scored judges: the expected score value
score_direction  str    No        For scored judges: "too_high" or "too_low"
expected_score and score_direction are only valid for scored judges (judges with evaluation_type: "scored"). The API will return a 400 error if these fields are provided for binary judges.
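
If the same pipeline submits feedback for both binary and scored judges, one way to avoid that 400 is to attach the score fields conditionally. A minimal sketch, assuming you track each judge's evaluation type yourself (the judge_is_scored flag below is illustrative):

from zeroeval import ZeroEval

client = ZeroEval()

# Illustrative flag; in practice this would come from your judge's configuration
# (evaluation_type: "scored" vs. binary).
judge_is_scored = True

feedback_kwargs = {
    "prompt_slug": "your-judge-task-slug",
    "completion_id": "span-id-here",
    "thumbs_up": False,
    "reason": "Score should have been lower",
    "judge_id": "automation-id-here",
}

# Only attach the score fields for scored judges; sending them for a binary
# judge results in a 400 error.
if judge_is_scored:
    feedback_kwargs["expected_score"] = 3.5
    feedback_kwargs["score_direction"] = "too_high"

client.send_feedback(**feedback_kwargs)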

Score-Based Feedback

For judges using scored rubrics (not binary pass/fail), you can provide additional feedback about the expected score:
from zeroeval import ZeroEval

client = ZeroEval()

# Submit feedback for a scored judge evaluation
client.send_feedback(
    prompt_slug="quality-scorer",
    completion_id="span-id-here",
    thumbs_up=False,                       # The judge was incorrect
    judge_id="automation-id-here",
    expected_score=3.5,                    # What the score should have been
    score_direction="too_high",            # The judge scored too high
    reason="Score should have been lower due to grammar issues",
)

REST API

Binary Judge Feedback

curl -X POST "https://api.zeroeval.com/v1/prompts/{task_slug}/completions/{span_id}/feedback" \
  -H "Authorization: Bearer $ZEROEVAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "thumbs_up": true,
    "reason": "Judge correctly identified the issue",
    "judge_id": "automation-uuid-here"
  }'

Scored Judge Feedback

For scored judges, include expected_score and score_direction:
curl -X POST "https://api.zeroeval.com/v1/prompts/{task_slug}/completions/{span_id}/feedback" \
  -H "Authorization: Bearer $ZEROEVAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "thumbs_up": false,
    "reason": "Score should have been lower",
    "judge_id": "automation-uuid-here",
    "expected_score": 3.5,
    "score_direction": "too_high"
  }'
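
The same request can be made from Python without the SDK, using the requests library. A sketch, assuming the API key is available in the ZEROEVAL_API_KEY environment variable (as in the curl examples):

import os

import requests

task_slug = "your-judge-task-slug"
span_id = "span-id-here"

# Equivalent of the curl example above for a scored judge.
response = requests.post(
    f"https://api.zeroeval.com/v1/prompts/{task_slug}/completions/{span_id}/feedback",
    headers={
        "Authorization": f"Bearer {os.environ['ZEROEVAL_API_KEY']}",
        "Content-Type": "application/json",
    },
    json={
        "thumbs_up": False,
        "reason": "Score should have been lower",
        "judge_id": "automation-uuid-here",
        "expected_score": 3.5,
        "score_direction": "too_high",
    },
)
response.raise_for_status()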

Finding Your IDs

ID         Where to Find It
Task Slug  In the judge settings, or the URL when editing the judge’s prompt
Span ID    In the evaluation modal, or via the get_judge_evaluations() response
Judge ID   In the URL when viewing a judge (/judges/{judge_id})

Bulk Feedback Submission

To submit feedback on multiple evaluations, iterate over the results of get_judge_evaluations():
from zeroeval import ZeroEval

client = ZeroEval()

# Get evaluations to review
evaluations = client.get_judge_evaluations(
    project_id="your-project-id",
    judge_id="your-judge-id",
    limit=100,
)

# Submit feedback for each
# Submit feedback for each (avoid shadowing the built-in eval)
for evaluation in evaluations["evaluations"]:
    # Your logic to determine if the evaluation was correct
    is_correct = your_review_logic(evaluation)

    client.send_feedback(
        prompt_slug="your-judge-task-slug",
        completion_id=evaluation["span_id"],
        thumbs_up=is_correct,
        reason="Automated review",
        judge_id="your-judge-id",
    )
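
In bulk runs you may prefer a failed submission to be logged rather than abort the loop. A sketch of a more defensive variant, reusing client and evaluations from the example above (no specific SDK exception types are assumed):

failures = []

for evaluation in evaluations["evaluations"]:
    is_correct = your_review_logic(evaluation)
    try:
        client.send_feedback(
            prompt_slug="your-judge-task-slug",
            completion_id=evaluation["span_id"],
            thumbs_up=is_correct,
            reason="Automated review",
            judge_id="your-judge-id",
        )
    except Exception as exc:  # catch-all; the SDK's exception types are not assumed
        failures.append((evaluation["span_id"], str(exc)))

if failures:
    print(f"{len(failures)} feedback submissions failed:", failures)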