Autotune takes a different approach to traditional evals. Instead of asking you to build complex eval pipelines, it ingests your production traces, lets you replay them against different models, and generates optimized prompts based on your feedback. Key features include:
  • Content-based versioning: Each unique prompt content gets its own version via SHA-256 hashing
  • Variable templating: Use {{variable}} syntax for dynamic content
  • Automatic tracking: All interactions are traced for analysis
  • One-click model deployments: Models update instantly without code changes
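The first feature, content-based versioning, can be sketched in a few lines. This is an illustrative stand-in, not ZeroEval's actual implementation: it assumes the version id is simply a truncated SHA-256 digest of the prompt text, so identical content always maps to the same version and any edit produces a new one.

```python
import hashlib

def prompt_version(content: str) -> str:
    """Derive a deterministic version id from prompt content.

    A minimal sketch of content-based versioning; the real id format
    used by ZeroEval may differ.
    """
    return hashlib.sha256(content.encode("utf-8")).hexdigest()[:12]

v1 = prompt_version("Summarize {{article}} in one sentence.")
v2 = prompt_version("Summarize {{article}} in two sentences.")
assert v1 != v2  # any content change yields a new version
assert v1 == prompt_version("Summarize {{article}} in one sentence.")  # same content, same version
```

Because the version is derived from content rather than a counter, re-deploying an unchanged prompt never creates a spurious new version.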

How it works

1. Instrument your code: Replace hardcoded prompts with ze.prompt() calls.
2. Every change creates a version: Each time you modify your prompt content, a new version is automatically created and tracked.
3. Collect performance data: ZeroEval automatically tracks all LLM interactions and their outcomes.
4. Tune and evaluate: Use the UI to run experiments, vote on outputs, and identify the best prompt/model combinations.
5. One-click model deployments: Winning configurations are automatically deployed to your application without code changes.
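The {{variable}} templating used in these prompts can be sketched as a simple substitution pass. This is illustrative only: ZeroEval resolves variables when a prompt is rendered, and the function name and behavior here are assumptions, not the SDK's API.

```python
import re

def render(template: str, variables: dict) -> str:
    """Substitute {{variable}} placeholders with values.

    A hypothetical sketch of {{variable}} templating; the SDK's
    actual rendering logic may handle escaping and missing keys
    differently.
    """
    return re.sub(r"\{\{(\w+)\}\}", lambda m: str(variables[m.group(1)]), template)

result = render(
    "Summarize {{article}} in {{n}} sentences.",
    {"article": "the report", "n": 2},
)
# result == "Summarize the report in 2 sentences."
```

Keeping variables out of the prompt body also means that changing a variable's value at runtime does not create a new prompt version; only edits to the template text do.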