Skip to main content
PATCH
/
test_framework
/
v1
/
metrics
/
{id}
Update a metric (partial)
curl --request PATCH \
  --url https://api.cekura.ai/test_framework/v1/metrics/{id}/ \
  --header 'Content-Type: application/json' \
  --header 'X-CEKURA-API-KEY: <api-key>' \
  --data '
{
  "name": "<string>",
  "description": "<string>",
  "audio_enabled": true,
  "prompt": "<string>",
  "project": 123,
  "agents": [
    123
  ],
  "assistant_id": "<string>",
  "enum_values": [
    "<string>"
  ],
  "display_order": 123,
  "configuration": {},
  "add_to_new_agents": true,
  "simulation_enabled": true,
  "observability_enabled": true,
  "sampling_enabled": true,
  "evaluation_trigger": "always",
  "evaluation_trigger_prompt": "<string>",
  "trigger_type": "llm_judge",
  "evaluation_trigger_custom_code": "<string>",
  "custom_code": "<string>"
}
'
{
  "name": "<string>",
  "description": "<string>",
  "audio_enabled": true,
  "prompt": "<string>",
  "project": 123,
  "assistant_id": "<string>",
  "display_order": 123,
  "configuration": {},
  "id": 123,
  "agents": [
    123
  ],
  "enum_values": [
    "<string>"
  ],
  "add_to_new_agents": true,
  "simulation_enabled": true,
  "observability_enabled": true,
  "sampling_enabled": true,
  "evaluation_trigger": "always",
  "evaluation_trigger_prompt": "<string>",
  "trigger_type": "llm_judge",
  "evaluation_trigger_custom_code": "<string>",
  "custom_code": "<string>"
}

Authorizations

X-CEKURA-API-KEY
string
header
required

API Key Authentication. It should be included in the header of each request.

Path Parameters

id
integer
required

A unique integer value identifying this metric.

Body

name
string

Name of the metric

description
string

Description of what this metric evaluates

audio_enabled
boolean

Whether this metric evaluates audio content

prompt
string

The evaluation prompt used for this metric

project
integer

ID of the project this metric belongs to.

agents
integer[]

List of agent IDs to enable this project-level metric for. Only applicable when project is set.

assistant_id
string

External identifier for the assistant

type
enum<string>

Type of metric (llm_judge recommended; basic and custom_prompt are deprecated)

  • basic - Basic (Deprecated in favor of LLM Judge)
  • custom_prompt - Custom Prompt ( Deprecated in favor of LLM Judge)
  • custom_code - Custom Code
  • llm_judge - LLM Judge
Available options:
basic,
custom_prompt,
custom_code,
llm_judge
eval_type
enum<string>

Output shape of the evaluation score

  • binary_workflow_adherence - Binary Workflow Adherence
  • binary_qualitative - Binary Qualitative
  • continuous_qualitative - Continuous Qualitative
  • numeric - Numeric
  • enum - Enum
Available options:
binary_workflow_adherence,
binary_qualitative,
continuous_qualitative,
numeric,
enum
enum_values
string[]

Possible values for enum-type metrics (list of strings, e.g. ["resolved", "escalated", "abandoned"])

display_order
integer

Order in which to display this metric in the UI

configuration
object

Per-metric configuration.

Some predefined metrics carry per-agent configuration, keyed by agent id: configuration[<key>] = {"<agent_id>": <value>}. On read, every agent attached to the metric is returned; on write, only the agent ids you include are updated.

Call-level keys (not per-agent):

  • Detect Silencesilence_duration (int, seconds, default 10): minimum mutual silence before flagging a failure.
  • Infrastructure Issuesinfra_issues_timeout (int, seconds, default 10): max seconds before the Main Agent must respond after the Testing Agent finishes speaking.

Per-agent keys (each value is {"<agent_id>": <value>}):

  • Dropoff Nodenodes: {"<agent_id>": [{name, description}]} — conversation stages used to classify where a call dropped off.
  • Topic of Callnodes: {"<agent_id>": [{name, description}]} — topic categories used to classify each call.
  • Pronunciation Checkpronunciation_words: {"<agent_id>": [[word, phonemes], ...]} — how specific words should be pronounced (e.g. {"42": [["Cekura", "suh-KYUR-uh"]]}).
  • Letterwise Pronunciationspelling_word_types: {"<agent_id>": ["name", "email", ...]} — word categories the Main Agent must spell out letter-by-letter.
  • Hallucinationhallucination_kb_files: {"<agent_id>": [file_id, ...]} — KnowledgeBaseFile IDs used as the source of truth for fact-checking.
add_to_new_agents
boolean | null

When enabled, this metric is automatically assigned to new agents created in the project.

simulation_enabled
boolean

Enable this metric for simulations. Example: true or false

observability_enabled
boolean

Enable this metric for observability. Example: true or false

sampling_enabled
boolean

Enable sampling for this metric using project-level sample rate

evaluation_trigger
enum<string>
default:always

When to run this metric.

  • always — evaluate every call (default)

  • automatic — system decides based on call content

  • custom — only evaluate when evaluation_trigger_prompt condition is met

  • always - Always

  • automatic - Automatic

  • custom - Custom

Available options:
always,
automatic,
custom
evaluation_trigger_prompt
string

LLM prompt that decides whether to evaluate this call. Only used when evaluation_trigger=custom and trigger_type=llm_judge. Example: "Did the agent offer a refund?"

trigger_type
enum<string>
default:llm_judge

How to evaluate the trigger condition. Only relevant when evaluation_trigger=custom.

  • llm_judge — use evaluation_trigger_prompt (default)

  • custom_code — use evaluation_trigger_custom_code

  • llm_judge - LLM Judge

  • custom_code - Custom Code

Available options:
llm_judge,
custom_code
evaluation_trigger_custom_code
string

Python code to evaluate the trigger condition. Only used when evaluation_trigger=custom and trigger_type=custom_code.

custom_code
string

Python code that implements the metric evaluation. Required when type=custom_code. Must define a function evaluate(transcript, ...) -> bool | float | str.

Response

name
string
required

Name of the metric

description
string
required

Description of what this metric evaluates

audio_enabled
boolean
required

Whether this metric evaluates audio content

prompt
string
required

The evaluation prompt used for this metric

project
integer
required

ID of the project this metric belongs to.

assistant_id
string
required

External identifier for the assistant

type
enum<string>
required

Type of metric (llm_judge recommended; basic and custom_prompt are deprecated)

  • basic - Basic (Deprecated in favor of LLM Judge)
  • custom_prompt - Custom Prompt ( Deprecated in favor of LLM Judge)
  • custom_code - Custom Code
  • llm_judge - LLM Judge
Available options:
basic,
custom_prompt,
custom_code,
llm_judge
eval_type
enum<string>
required

Output shape of the evaluation score

  • binary_workflow_adherence - Binary Workflow Adherence
  • binary_qualitative - Binary Qualitative
  • continuous_qualitative - Continuous Qualitative
  • numeric - Numeric
  • enum - Enum
Available options:
binary_workflow_adherence,
binary_qualitative,
continuous_qualitative,
numeric,
enum
display_order
integer
required

Order in which to display this metric in the UI

configuration
object
required

Per-metric configuration.

Some predefined metrics carry per-agent configuration, keyed by agent id: configuration[<key>] = {"<agent_id>": <value>}. On read, every agent attached to the metric is returned; on write, only the agent ids you include are updated.

Call-level keys (not per-agent):

  • Detect Silencesilence_duration (int, seconds, default 10): minimum mutual silence before flagging a failure.
  • Infrastructure Issuesinfra_issues_timeout (int, seconds, default 10): max seconds before the Main Agent must respond after the Testing Agent finishes speaking.

Per-agent keys (each value is {"<agent_id>": <value>}):

  • Dropoff Nodenodes: {"<agent_id>": [{name, description}]} — conversation stages used to classify where a call dropped off.
  • Topic of Callnodes: {"<agent_id>": [{name, description}]} — topic categories used to classify each call.
  • Pronunciation Checkpronunciation_words: {"<agent_id>": [[word, phonemes], ...]} — how specific words should be pronounced (e.g. {"42": [["Cekura", "suh-KYUR-uh"]]}).
  • Letterwise Pronunciationspelling_word_types: {"<agent_id>": ["name", "email", ...]} — word categories the Main Agent must spell out letter-by-letter.
  • Hallucinationhallucination_kb_files: {"<agent_id>": [file_id, ...]} — KnowledgeBaseFile IDs used as the source of truth for fact-checking.
id
integer
read-only
agents
integer[]

List of agent IDs to enable this project-level metric for. Only applicable when project is set.

enum_values
string[]

Possible values for enum-type metrics (list of strings, e.g. ["resolved", "escalated", "abandoned"])

add_to_new_agents
boolean | null

When enabled, this metric is automatically assigned to new agents created in the project.

simulation_enabled
boolean

Enable this metric for simulations. Example: true or false

observability_enabled
boolean

Enable this metric for observability. Example: true or false

sampling_enabled
boolean

Enable sampling for this metric using project-level sample rate

evaluation_trigger
enum<string>
default:always

When to run this metric.

  • always — evaluate every call (default)

  • automatic — system decides based on call content

  • custom — only evaluate when evaluation_trigger_prompt condition is met

  • always - Always

  • automatic - Automatic

  • custom - Custom

Available options:
always,
automatic,
custom
evaluation_trigger_prompt
string

LLM prompt that decides whether to evaluate this call. Only used when evaluation_trigger=custom and trigger_type=llm_judge. Example: "Did the agent offer a refund?"

trigger_type
enum<string>
default:llm_judge

How to evaluate the trigger condition. Only relevant when evaluation_trigger=custom.

  • llm_judge — use evaluation_trigger_prompt (default)

  • custom_code — use evaluation_trigger_custom_code

  • llm_judge - LLM Judge

  • custom_code - Custom Code

Available options:
llm_judge,
custom_code
evaluation_trigger_custom_code
string

Python code to evaluate the trigger condition. Only used when evaluation_trigger=custom and trigger_type=custom_code.

custom_code
string

Python code that implements the metric evaluation. Required when type=custom_code. Must define a function evaluate(transcript, ...) -> bool | float | str.