TL;DR
- Pull (or create) a dataset
- Write a task – any Python function that maps a row ➜ model output
- Optionally write evaluators – functions that score (row, output)
- Wrap them in ze.Experiment and call .run()
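
The four steps above can be sketched in plain Python. This is a minimal illustration of the workflow's shape, not the library's API: `SketchExperiment` is a hypothetical stand-in for ze.Experiment (whose real constructor arguments and return value are not shown here and may differ), and the dataset, the `fake_model` lookup, and the `exact_match` evaluator are all made up for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

Row = dict[str, Any]


# Hypothetical stand-in for ze.Experiment. The real class may take
# different arguments and return a different result type.
@dataclass
class SketchExperiment:
    dataset: list[Row]
    task: Callable[[Row], Any]
    evaluators: list[Callable[[Row, Any], float]] = field(default_factory=list)

    def run(self) -> list[dict[str, Any]]:
        """Apply the task to every row, then score each (row, output) pair."""
        results = []
        for row in self.dataset:
            output = self.task(row)
            scores = {ev.__name__: ev(row, output) for ev in self.evaluators}
            results.append({"row": row, "output": output, "scores": scores})
        return results


# Step 1: create a dataset (made-up rows for illustration).
dataset = [
    {"question": "2 + 2", "answer": "4"},
    {"question": "3 * 3", "answer": "9"},
]

# A fake "model" so the sketch runs offline; the second answer is wrong on purpose.
fake_model = {"2 + 2": "4", "3 * 3": "6"}


# Step 2: a task maps a row to a model output.
def task(row: Row) -> str:
    return fake_model[row["question"]]


# Step 3: an evaluator scores a (row, output) pair.
def exact_match(row: Row, output: str) -> float:
    return 1.0 if output == row["answer"] else 0.0


# Step 4: wrap everything in an experiment and call .run().
results = SketchExperiment(dataset, task, [exact_match]).run()
for r in results:
    print(r["row"]["question"], "->", r["output"], r["scores"])
```

Keeping the task and the evaluators as ordinary functions means each can be tested on a single row before it is wrapped in an experiment.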
Run A/B tests on your models.