Base URL: `https://api.zeroeval.com`

All requests require a Bearer token:

`Authorization: Bearer YOUR_ZEROEVAL_API_KEY`
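Every endpoint below expects the same two headers. A minimal Python sketch (the helper name and the `ZEROEVAL_API_KEY` environment variable are our conventions, mirroring the curl examples, not part of any official SDK):

```python
import os

BASE_URL = "https://api.zeroeval.com"

def auth_headers(api_key: str) -> dict:
    """Headers required on every ZeroEval API request."""
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

# Read the key from the environment, as in the curl examples below.
headers = auth_headers(os.environ.get("ZEROEVAL_API_KEY", "YOUR_ZEROEVAL_API_KEY"))
```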
## Completion Feedback

`POST /v1/prompts/{prompt_slug}/completions/{completion_id}/feedback`
Submit structured feedback for a specific LLM completion. This feedback powers prompt optimization.
Request body:
| Field | Type | Required | Description |
|---|---|---|---|
| thumbs_up | bool | Yes | Positive or negative feedback |
| reason | string | No | Explanation of the feedback |
| expected_output | string | No | What the output should have been |
| metadata | object | No | Additional metadata |
| judge_id | string | No | Judge automation ID |
| expected_score | float | No | Expected score (for scored judges) |
| score_direction | string | No | `"too_high"` or `"too_low"` |
| criteria_feedback | object | No | Per-criterion feedback |
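Optional fields can simply be omitted from the body rather than sent as null. A small payload builder, as a sketch (the function is illustrative, not from any SDK), that keeps only the fields you set and validates `score_direction`:

```python
def build_feedback_payload(thumbs_up: bool, **optional) -> dict:
    """Build the feedback request body, dropping unset optional fields.

    Accepted optional keys mirror the table above: reason, expected_output,
    metadata, judge_id, expected_score, score_direction, criteria_feedback.
    """
    allowed = {"reason", "expected_output", "metadata", "judge_id",
               "expected_score", "score_direction", "criteria_feedback"}
    unknown = set(optional) - allowed
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    if optional.get("score_direction") not in (None, "too_high", "too_low"):
        raise ValueError('score_direction must be "too_high" or "too_low"')
    payload = {"thumbs_up": bool(thumbs_up)}
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```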
```bash
curl -X POST https://api.zeroeval.com/v1/prompts/support-bot/completions/550e8400-.../feedback \
  -H "Authorization: Bearer $ZEROEVAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "thumbs_up": false,
    "reason": "Response was too vague",
    "expected_output": "Should provide specific steps"
  }'
```
Response: 200
```json
{
  "id": "fb123e45-...",
  "completion_id": "550e8400-...",
  "prompt_id": "a1b2c3d4-...",
  "thumbs_up": false,
  "reason": "Response was too vague",
  "expected_output": "Should provide specific steps",
  "created_at": "2025-01-15T10:30:00Z"
}
```
If feedback already exists for the same completion from the same user, it will
be updated with the new values.
## Unified Entity Feedback

`GET /projects/{project_id}/feedback/{entity_type}/{entity_id}`
Retrieve all feedback — human reviews and judge evaluations — for a span, trace, or session in a single response.
| Path Parameter | Description |
|---|---|
| project_id | UUID of the project |
| entity_type | `span`, `trace`, or `session` |
| entity_id | UUID of the entity |
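A request URL for this endpoint can be assembled client-side; a minimal sketch (helper name is ours) that rejects unsupported entity types before any network call:

```python
from urllib.parse import quote

VALID_ENTITY_TYPES = {"span", "trace", "session"}

def entity_feedback_url(project_id: str, entity_type: str, entity_id: str,
                        base_url: str = "https://api.zeroeval.com") -> str:
    """Build the unified-feedback URL for a span, trace, or session."""
    if entity_type not in VALID_ENTITY_TYPES:
        raise ValueError(f"entity_type must be one of {sorted(VALID_ENTITY_TYPES)}")
    return (f"{base_url}/projects/{quote(project_id)}"
            f"/feedback/{entity_type}/{quote(entity_id)}")
```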
Response: 200
```json
{
  "entity_type": "span",
  "entity_id": "550e8400-...",
  "summary": {
    "total": 3,
    "human_feedback_count": 1,
    "judge_evaluation_count": 2
  },
  "items": [
    {
      "kind": "human_feedback",
      "id": "fb123e45-...",
      "span_id": "550e8400-...",
      "thumbs_up": true,
      "reason": "Clear and helpful",
      "created_at": "2025-01-15T10:30:00Z",
      "created_by": {
        "id": "user-123",
        "email": "[email protected]",
        "name": "Alice"
      },
      "source_type": "human"
    },
    {
      "kind": "judge_evaluation",
      "id": "je456f78-...",
      "span_id": "550e8400-...",
      "automation_id": "judge-abc-...",
      "judge_name": "Helpfulness",
      "evaluation_result": true,
      "evaluation_reason": "Response directly answers the question with clear steps.",
      "confidence_score": 0.92,
      "model_used": "gemini-3-flash-preview",
      "evaluation_duration_ms": 1200,
      "score": 8.5,
      "evaluation_type": "scored",
      "score_min": 0,
      "score_max": 10,
      "pass_threshold": 7.0,
      "criteria_scores": {
        "clarity": { "score": 9, "reason": "Well-structured response" },
        "accuracy": { "score": 8, "reason": "Correct information provided" }
      },
      "created_at": "2025-01-15T10:31:00Z"
    }
  ]
}
```
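Because human reviews and judge evaluations share one list, clients typically partition on the `kind` discriminator. A sketch that splits a response and cross-checks the `summary` counts:

```python
def split_feedback(response: dict):
    """Partition unified feedback items by `kind` and verify the summary."""
    human = [i for i in response["items"] if i["kind"] == "human_feedback"]
    judge = [i for i in response["items"] if i["kind"] == "judge_evaluation"]
    summary = response["summary"]
    # The summary counts should agree with the items list.
    assert summary["human_feedback_count"] == len(human)
    assert summary["judge_evaluation_count"] == len(judge)
    assert summary["total"] == len(response["items"])
    return human, judge
```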
### Response fields
`summary` — aggregate counts for fast display:

| Field | Type | Description |
|---|---|---|
| total | int | Total feedback items |
| human_feedback_count | int | Number of human review items |
| judge_evaluation_count | int | Number of judge evaluation items |
`items[]` — each item has a `kind` field (`human_feedback` or `judge_evaluation`) that determines which fields are present:

| Field (human_feedback) | Type | Description |
|---|---|---|
| thumbs_up | bool | Positive or negative |
| reason | string | Reviewer's explanation |
| expected_output | string | Corrected output (if provided) |
| created_by | object | User who submitted the feedback |
| source_type | string | `"human"` or `"judge"` |
| Field (judge_evaluation) | Type | Description |
|---|---|---|
| automation_id | string | Judge automation UUID |
| judge_name | string | Display name of the judge |
| evaluation_result | bool | Whether the output passed |
| evaluation_reason | string | Judge's reasoning |
| confidence_score | float | Judge confidence (0-1) |
| model_used | string | Model used for the evaluation |
| evaluation_duration_ms | int | Evaluation duration in milliseconds |
| score | float | Score value (scored evaluations only) |
| evaluation_type | string | `"binary"` or `"scored"` |
| score_min / score_max | float | Score range (scored evaluations only) |
| pass_threshold | float | Threshold for pass/fail |
| criteria_scores | object | Per-criterion scores and reasons |
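A client-side pass/fail check might look like the sketch below. Note the assumption: the docs above do not spell out the comparison, so we treat "score at or above `pass_threshold`" as passing for scored judges, while binary judges report `evaluation_result` directly.

```python
def judge_passed(item: dict) -> bool:
    """Pass/fail for a judge_evaluation item.

    Assumption: a scored judge passes when score >= pass_threshold;
    a binary judge's evaluation_result is taken as-is.
    """
    if item.get("evaluation_type") == "scored":
        return item["score"] >= item["pass_threshold"]
    return bool(item.get("evaluation_result"))
```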
For traces and sessions, feedback is aggregated from all descendant spans.