Setup steps and authentication are in the Overview. This page covers production-call observability.
A call is a production conversation captured from your live voice agent. Once a call is in Cekura, you can run metrics on it, build test sets from it, and analyze trends across thousands of calls. Calls from Vapi, Retell, ElevenLabs, LiveKit, and Pipecat are ingested automatically through webhooks; you can also send calls programmatically.

List calls

cekura calls list --project-id 742
cekura calls list --agent-id 123 --format json | jq '.[] | .id, .duration'
Common filters: project_id, agent_id, from_date, to_date, success, topic. See the API Reference for the full list.
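For scripted analysis, the JSON output can be read from Python instead of jq. This is a minimal sketch, assuming `--format json` prints a JSON array whose objects carry `id` and `duration` fields, as the jq example above suggests. The helper names `list_calls` and `summarize_calls` are illustrative, and the unit of `duration` is not stated in this page.

```python
import json
import subprocess


def list_calls(agent_id: int) -> list[dict]:
    """Run `cekura calls list` for one agent and parse stdout.

    Assumes the output is a JSON array, as the jq example implies.
    """
    result = subprocess.run(
        ["cekura", "calls", "list", "--agent-id", str(agent_id), "--format", "json"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the CLI exits non-zero
    )
    return json.loads(result.stdout)


def summarize_calls(calls: list[dict]) -> dict:
    """Count calls and aggregate durations, skipping entries without a numeric duration."""
    durations = [
        c["duration"] for c in calls if isinstance(c.get("duration"), (int, float))
    ]
    return {
        "count": len(calls),
        "total_duration": sum(durations),
        "mean_duration": sum(durations) / len(durations) if durations else None,
    }
```

Calling `summarize_calls(list_calls(123))` would then give a quick per-agent overview without leaving the script.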

Inspect a call

cekura calls get 7788

Send a call for evaluation

If your provider isn’t on the auto-ingestion list — or you want to ship a call from your own backend — send it explicitly.
cekura calls send --from-file call.json
Here, call.json contains the agent ID, transcript, duration, and metadata.
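As an illustration of assembling such a file from your own backend, the sketch below writes the four pieces of information to disk. The key names (`agent_id`, `transcript`, `duration`, `metadata`) and the transcript shape are hypothetical placeholders, not the confirmed schema; check the API Reference for the exact field names before sending.

```python
import json
from pathlib import Path


def write_call_file(path, agent_id, transcript, duration, metadata=None):
    """Write a call payload to `path` and return it.

    Key names are illustrative assumptions, not the documented schema.
    """
    payload = {
        "agent_id": agent_id,        # assumed key for the agent ID
        "transcript": transcript,    # assumed key; shape of turns is a guess
        "duration": duration,        # assumed key; unit not stated in this page
        "metadata": metadata or {},  # assumed key for free-form metadata
    }
    Path(path).write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return payload
```

The resulting file can then be passed to `cekura calls send --from-file call.json`.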

Run metrics on a call

Score an already-ingested call against one or more metrics.
cekura calls evaluate 7788 --metric-ids 55,56

# Re-run a previous evaluation (e.g. after editing a metric prompt)
cekura calls rerun-evaluation 7788 --metric-ids 55
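To score several calls in one pass, the two commands above can be wrapped in a small loop. This is a sketch under stated assumptions: it uses only the `evaluate` and `rerun-evaluation` invocations shown here, treats a non-zero exit code as failure (an assumption about the CLI), and the helper names are illustrative.

```python
import subprocess


def evaluate_command(call_id: int, metric_ids: list[int], rerun: bool = False) -> list[str]:
    """Build the argv for `cekura calls evaluate` or `cekura calls rerun-evaluation`."""
    if not metric_ids:
        raise ValueError("at least one metric ID is required")
    subcommand = "rerun-evaluation" if rerun else "evaluate"
    return [
        "cekura", "calls", subcommand, str(call_id),
        "--metric-ids", ",".join(str(m) for m in metric_ids),
    ]


def evaluate_calls(call_ids, metric_ids, run=subprocess.run):
    """Evaluate each call in turn and return the IDs whose command exited non-zero.

    `run` is injectable so the loop can be exercised without the CLI installed.
    """
    failed = []
    for call_id in call_ids:
        completed = run(evaluate_command(call_id, metric_ids))
        if completed.returncode != 0:  # assumption: non-zero exit means failure
            failed.append(call_id)
    return failed
```

For example, `evaluate_calls([7788, 7789], [55, 56])` runs the same metrics against both calls and reports which ones did not complete.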

Generate scenarios from real calls

Found an interesting production call? Turn it into regression scenarios.
cekura calls create-scenarios 7788 --count 3
cekura calls create-scenarios-progress --progress-id <id>
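Because the progress command is checked separately from the command that starts the work, a script usually needs to poll it. The sketch below shows one way to do that. The output format of the progress command is not documented on this page, so the completion check is left to a caller-supplied `is_done` function; `fetch_progress` and `wait_until_done` are hypothetical helper names.

```python
import subprocess
import time


def fetch_progress(progress_id: str) -> str:
    """Return the raw stdout of `cekura calls create-scenarios-progress`."""
    result = subprocess.run(
        ["cekura", "calls", "create-scenarios-progress", "--progress-id", progress_id],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def wait_until_done(fetch, is_done, interval=5.0, max_attempts=60, sleep=time.sleep):
    """Call fetch() until is_done(output) is true, then return that output.

    Raises TimeoutError if the check never passes within max_attempts.
    """
    for attempt in range(max_attempts):
        output = fetch()
        if is_done(output):
            return output
        if attempt < max_attempts - 1:
            sleep(interval)  # wait before the next poll, but not after the last one
    raise TimeoutError(f"not finished after {max_attempts} attempts")
```

A caller would pass something like `lambda: fetch_progress(progress_id)` as `fetch` and write `is_done` after inspecting what the progress command actually prints. The same loop applies to `improve-prompt-progress` by swapping the subcommand.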

Promote a call into a test set

cekura test-sets create-from-call-log \
  --call-log-id 7788 \
  --name "support-edge-cases"

Improve the agent’s prompt from real failures

Analyze failure patterns across recent calls to suggest a prompt improvement.
cekura calls improve-prompt --agent-id 123 --lookback-days 7
cekura calls improve-prompt-progress --progress-id <id>

See also

Metrics: Define what you score production calls on.
Runs & Results: The same scoring engine, applied to simulated rather than production traffic.
Dashboards: Visualize call quality and metric trends over time.
API Reference: Full field reference for call logs and observation payloads.