Setup steps and authentication are in the Overview. This page covers metric authoring and management.
Metrics come in three types:
- Predefined: platform-managed metrics like sentiment, interruption count, latency.
- LLM judge: you write a prompt; an LLM scores the transcript.
- Custom code: Python that runs against the call payload.
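As a rough picture of the custom-code type, the sketch below checks whether the agent spoke a required disclosure phrase. The payload shape (a `transcript` list of turns with `speaker` and `text` fields), the `evaluate` entry point, and the returned `passed`/`score` keys are assumptions made for this example, not the platform's documented contract; the API Reference has the real payload fields.

```python
from typing import Any

# Hypothetical compliance phrase the agent is expected to say.
REQUIRED_PHRASE = "this call may be recorded"


def evaluate(payload: dict[str, Any]) -> dict[str, Any]:
    """Pass if any single agent turn contains the required phrase.

    Assumed (hypothetical) payload shape:
    {"transcript": [{"speaker": "agent" | "user", "text": str}, ...]}
    """
    turns = payload.get("transcript", [])
    passed = any(
        turn.get("speaker") == "agent"
        and REQUIRED_PHRASE in turn.get("text", "").lower()
        for turn in turns
    )
    return {"passed": passed, "score": 1.0 if passed else 0.0}


# Example payload, invented for illustration.
sample_call = {
    "transcript": [
        {"speaker": "agent", "text": "Hi! This call may be recorded for quality."},
        {"speaker": "user", "text": "Okay, I need to change my address."},
    ]
}
result = evaluate(sample_call)
```

A deterministic check like this suits rules that can be stated exactly; judgments that need interpretation of tone or intent are a better fit for the LLM judge type.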
Browse predefined metrics
These are read-only and shared across the platform.
Create a metric
Prepare metric.json, then apply it.
List, update, delete
Bulk operations
Manage many metrics in one call.
Generate metrics with AI
Stuck on what to measure? Cekura can propose metrics from your agent’s scenarios.
Critical metric scenarios & reviews
Two advanced workflows:
- Critical metric scenarios: scenarios flagged because their metric output drifted. Useful for triage.
- Metric reviews (Labs pipeline): kick off feedback processing on metric judgments and watch progress.
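This page does not say how drift is detected. As one way to picture the triage idea, the sketch below flags a scenario when its latest score differs from the mean of its earlier scores by more than a threshold. The data layout, the `flag_drifted` name, and the 0.2 threshold are all assumptions for illustration, not the platform's actual criterion.

```python
from statistics import mean


def flag_drifted(history: dict[str, list[float]], threshold: float = 0.2) -> list[str]:
    """Return scenario names whose latest score moved away from their baseline.

    `history` maps a scenario name to its metric scores, oldest first
    (a hypothetical layout). The baseline is the mean of all earlier scores.
    """
    flagged = []
    for scenario, scores in history.items():
        if len(scores) < 2:
            continue  # need at least one baseline score plus the latest one
        baseline = mean(scores[:-1])
        if abs(scores[-1] - baseline) > threshold:
            flagged.append(scenario)
    return sorted(flagged)


# Invented scores for three scenarios.
history = {
    "refund_request": [0.9, 0.9, 0.5],  # latest run dropped sharply
    "address_change": [0.8, 0.8, 0.8],  # stable
    "new_scenario": [0.7],              # no baseline yet
}
drifted = flag_drifted(history)
```

Scenarios with a single score are skipped rather than flagged, so a newly added scenario does not show up in triage before it has a baseline.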
See also
- Runs & Results: where metric scores show up after a run.
- Calls: run metrics against production calls, not just simulations.
- Metric concepts: LLM judge, Python, rubric, sampling, and when to use each.
- API Reference: full field reference for metric payloads.