POST /test_framework/v1/metrics
Create a metric
curl --request POST \
  --url https://api.cekura.ai/test_framework/v1/metrics/ \
  --header 'Content-Type: application/json' \
  --header 'X-CEKURA-API-KEY: <api-key>' \
  --data '
{
  "name": "<string>",
  "description": "<string>",
  "audio_enabled": true,
  "prompt": "<string>",
  "project": 123,
  "assistant_id": "<string>",
  "display_order": 123,
  "configuration": {},
  "agents": [
    123
  ],
  "enum_values": [
    "<string>"
  ],
  "add_to_new_agents": true,
  "simulation_enabled": true,
  "observability_enabled": true,
  "sampling_enabled": true,
  "evaluation_trigger": "always",
  "evaluation_trigger_prompt": "<string>",
  "trigger_type": "llm_judge",
  "evaluation_trigger_custom_code": "<string>",
  "custom_code": "<string>"
}
'
{
  "name": "<string>",
  "description": "<string>",
  "audio_enabled": true,
  "prompt": "<string>",
  "project": 123,
  "assistant_id": "<string>",
  "display_order": 123,
  "configuration": {},
  "id": 123,
  "agents": [
    123
  ],
  "enum_values": [
    "<string>"
  ],
  "add_to_new_agents": true,
  "simulation_enabled": true,
  "observability_enabled": true,
  "sampling_enabled": true,
  "evaluation_trigger": "always",
  "evaluation_trigger_prompt": "<string>",
  "trigger_type": "llm_judge",
  "evaluation_trigger_custom_code": "<string>",
  "custom_code": "<string>"
}

Authorizations

X-CEKURA-API-KEY
string
header
required

API Key Authentication. It should be included in the header of each request.
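As a minimal sketch of the authentication rule above, the following Python builds (but does not send) the POST request with the X-CEKURA-API-KEY header, using only the standard library. The helper name build_create_metric_request and the sample payload are illustrative assumptions, not part of the Cekura API.

```python
import json
import urllib.request

API_URL = "https://api.cekura.ai/test_framework/v1/metrics/"


def build_create_metric_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for creating a metric.

    Hypothetical helper: only the URL and header names come from the docs.
    """
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # The API key goes in a header on every request.
            "X-CEKURA-API-KEY": api_key,
        },
        method="POST",
    )


# Placeholder key and an illustrative, incomplete payload.
request = build_create_metric_request("<api-key>", {"name": "Refund offered"})
# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it possible to inspect the headers and body before any network call is made.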

Body

name
string
required

Name of the metric

description
string
required

Description of what this metric evaluates

audio_enabled
boolean
required

Whether this metric evaluates audio content

prompt
string
required

The evaluation prompt used for this metric

project
integer
required

ID of the project this metric belongs to.

assistant_id
string
required

External identifier for the assistant

type
enum<string>
required

Type of metric (llm_judge recommended; basic and custom_prompt are deprecated)

  • basic - Basic (Deprecated in favor of LLM Judge)
  • custom_prompt - Custom Prompt (Deprecated in favor of LLM Judge)
  • custom_code - Custom Code
  • llm_judge - LLM Judge
Available options: basic, custom_prompt, custom_code, llm_judge
eval_type
enum<string>
required

Output shape of the evaluation score

  • binary_workflow_adherence - Binary Workflow Adherence
  • binary_qualitative - Binary Qualitative
  • continuous_qualitative - Continuous Qualitative
  • numeric - Numeric
  • enum - Enum
Available options: binary_workflow_adherence, binary_qualitative, continuous_qualitative, numeric, enum
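The type and eval_type options listed above can be checked on the client before a request is sent. The sketch below is one possible way to do that; the function name check_metric_kind and the choice to warn rather than fail on deprecated types are assumptions, and the server may apply additional rules.

```python
# Allowed values copied from the "Available options" lists above.
METRIC_TYPES = {"basic", "custom_prompt", "custom_code", "llm_judge"}
DEPRECATED_TYPES = {"basic", "custom_prompt"}
EVAL_TYPES = {
    "binary_workflow_adherence",
    "binary_qualitative",
    "continuous_qualitative",
    "numeric",
    "enum",
}


def check_metric_kind(metric_type: str, eval_type: str) -> list:
    """Return warnings for deprecated types; raise ValueError for unknown values.

    Hypothetical client-side helper, not part of the API.
    """
    if metric_type not in METRIC_TYPES:
        raise ValueError(f"unknown type: {metric_type!r}")
    if eval_type not in EVAL_TYPES:
        raise ValueError(f"unknown eval_type: {eval_type!r}")
    warnings = []
    if metric_type in DEPRECATED_TYPES:
        warnings.append(f"type {metric_type!r} is deprecated in favor of 'llm_judge'")
    return warnings


warnings = check_metric_kind("llm_judge", "binary_qualitative")
```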
display_order
integer
required

Order in which to display this metric in the UI

configuration
object
required

Custom configuration parameters for specific metrics. For the pronunciation metric, you can set words as a list of 2-tuples (word, phonemes). Example:

{
    "words": [["hello", "hɛl.loʊ"], ["world", "wɝɚɚɚld"]]
}
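Because JSON has no tuple type, each (word, phonemes) pair ends up as a two-element array, as in the example above. A minimal Python sketch of building that object follows; pronunciation_config is a hypothetical helper name.

```python
import json


def pronunciation_config(pairs):
    """Turn (word, phonemes) pairs into the configuration object shown above."""
    return {"words": [[word, phonemes] for word, phonemes in pairs]}


config = pronunciation_config([("hello", "hɛl.loʊ")])

# ensure_ascii=False keeps the IPA characters readable in the serialized body.
body_fragment = json.dumps({"configuration": config}, ensure_ascii=False)
```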
agents
integer[]

List of agent IDs to enable this project-level metric for. Only applicable when project is set.

enum_values
string[]

Possible values for enum-type metrics (list of strings, e.g. ["resolved", "escalated", "abandoned"])
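As a sketch, the fields for an enum-type metric could be assembled like this. Pairing enum_values with eval_type set to enum follows from the description above; the helper name enum_metric_fields, the whitespace trimming, and the de-duplication are assumptions made for illustration, not documented API behavior.

```python
def enum_metric_fields(values):
    """Build the eval_type / enum_values fields for an enum metric.

    Hypothetical helper; trimming and de-duplication are client-side choices.
    """
    cleaned = [v.strip() for v in values if v.strip()]
    # Drop duplicates while keeping the original order.
    unique = list(dict.fromkeys(cleaned))
    if not unique:
        raise ValueError("enum_values needs at least one non-empty string")
    return {"eval_type": "enum", "enum_values": unique}


fields = enum_metric_fields(["resolved", "escalated", "abandoned", "resolved"])
```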

add_to_new_agents
boolean | null

When enabled, this metric is automatically assigned to new agents created in the project.

simulation_enabled
boolean

Enable this metric for simulations. Example: true or false

observability_enabled
boolean

Enable this metric for observability. Example: true or false

sampling_enabled
boolean

Enable sampling for this metric using the project-level sample rate

evaluation_trigger
enum<string>
default:always

When to run this metric.

  • always - evaluate every call (default)
  • automatic - the system decides based on call content
  • custom - evaluate only when the custom trigger condition is met (evaluation_trigger_prompt or evaluation_trigger_custom_code, depending on trigger_type)
Available options: always, automatic, custom
evaluation_trigger_prompt
string

LLM prompt that decides whether to evaluate this call. Only used when evaluation_trigger=custom and trigger_type=llm_judge. Example: "Did the agent offer a refund?"

trigger_type
enum<string>
default:llm_judge

How to evaluate the trigger condition. Only relevant when evaluation_trigger=custom.

  • llm_judge - use evaluation_trigger_prompt (default)
  • custom_code - use evaluation_trigger_custom_code
Available options: llm_judge, custom_code
evaluation_trigger_custom_code
string

Python code to evaluate the trigger condition. Only used when evaluation_trigger=custom and trigger_type=custom_code.
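The interaction between evaluation_trigger, trigger_type, and the two trigger fields described above can be summarized in a small Python sketch. The helper trigger_fields is hypothetical, and treating a missing prompt or code as an error is an assumption; the documentation only states when each field is used.

```python
def trigger_fields(evaluation_trigger="always", trigger_type="llm_judge",
                   prompt=None, code=None):
    """Return only the trigger fields that apply to the chosen combination."""
    if evaluation_trigger not in {"always", "automatic", "custom"}:
        raise ValueError(f"unknown evaluation_trigger: {evaluation_trigger!r}")
    if trigger_type not in {"llm_judge", "custom_code"}:
        raise ValueError(f"unknown trigger_type: {trigger_type!r}")

    fields = {"evaluation_trigger": evaluation_trigger}
    if evaluation_trigger != "custom":
        # trigger_type and the prompt/code fields only matter for "custom".
        return fields

    fields["trigger_type"] = trigger_type
    if trigger_type == "llm_judge":
        if not prompt:
            raise ValueError("llm_judge trigger needs evaluation_trigger_prompt")
        fields["evaluation_trigger_prompt"] = prompt
    else:
        if not code:
            raise ValueError("custom_code trigger needs evaluation_trigger_custom_code")
        fields["evaluation_trigger_custom_code"] = code
    return fields


refund_trigger = trigger_fields(
    "custom", "llm_judge", prompt="Did the agent offer a refund?"
)
```

The returned dictionary can be merged into the request body alongside the required fields.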

custom_code
string

Python code that implements the metric evaluation. Required when type=custom_code. Must define a function evaluate(transcript, ...) -> bool | float | str.
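A minimal sketch of what a custom_code value might look like is shown below. It assumes the transcript can be converted to text with str(); the actual shape of transcript and of any additional arguments is not specified here, so treat the function body as illustrative only. The code is kept as a string because that is how the field is sent in the request body.

```python
# Source sent as the custom_code string. The keyword check is a made-up example.
CUSTOM_CODE = '''
def evaluate(transcript, **kwargs):
    # Assumption: str(transcript) yields the conversation text.
    return "refund" in str(transcript).lower()
'''

# Sanity-check the source locally before putting it in the request body.
namespace = {}
exec(CUSTOM_CODE, namespace)
evaluate = namespace["evaluate"]

payload_fields = {"type": "custom_code", "custom_code": CUSTOM_CODE}
```

Running the source locally first catches syntax errors early, although it says nothing about how the server executes the code.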

Response

name
string
required

Name of the metric

description
string
required

Description of what this metric evaluates

audio_enabled
boolean
required

Whether this metric evaluates audio content

prompt
string
required

The evaluation prompt used for this metric

project
integer
required

ID of the project this metric belongs to.

assistant_id
string
required

External identifier for the assistant

type
enum<string>
required

Type of metric (llm_judge recommended; basic and custom_prompt are deprecated)

  • basic - Basic (Deprecated in favor of LLM Judge)
  • custom_prompt - Custom Prompt (Deprecated in favor of LLM Judge)
  • custom_code - Custom Code
  • llm_judge - LLM Judge
Available options: basic, custom_prompt, custom_code, llm_judge
eval_type
enum<string>
required

Output shape of the evaluation score

  • binary_workflow_adherence - Binary Workflow Adherence
  • binary_qualitative - Binary Qualitative
  • continuous_qualitative - Continuous Qualitative
  • numeric - Numeric
  • enum - Enum
Available options: binary_workflow_adherence, binary_qualitative, continuous_qualitative, numeric, enum
display_order
integer
required

Order in which to display this metric in the UI

configuration
object
required

Custom configuration parameters for specific metrics. For the pronunciation metric, you can set words as a list of 2-tuples (word, phonemes). Example:

{
    "words": [["hello", "hɛl.loʊ"], ["world", "wɝɚɚɚld"]]
}
id
integer
read-only
agents
integer[]

List of agent IDs to enable this project-level metric for. Only applicable when project is set.

enum_values
string[]

Possible values for enum-type metrics (list of strings, e.g. ["resolved", "escalated", "abandoned"])

add_to_new_agents
boolean | null

When enabled, this metric is automatically assigned to new agents created in the project.

simulation_enabled
boolean

Enable this metric for simulations. Example: true or false

observability_enabled
boolean

Enable this metric for observability. Example: true or false

sampling_enabled
boolean

Enable sampling for this metric using the project-level sample rate

evaluation_trigger
enum<string>
default:always

When to run this metric.

  • always - evaluate every call (default)
  • automatic - the system decides based on call content
  • custom - evaluate only when the custom trigger condition is met (evaluation_trigger_prompt or evaluation_trigger_custom_code, depending on trigger_type)
Available options: always, automatic, custom
evaluation_trigger_prompt
string

LLM prompt that decides whether to evaluate this call. Only used when evaluation_trigger=custom and trigger_type=llm_judge. Example: "Did the agent offer a refund?"

trigger_type
enum<string>
default:llm_judge

How to evaluate the trigger condition. Only relevant when evaluation_trigger=custom.

  • llm_judge - use evaluation_trigger_prompt (default)
  • custom_code - use evaluation_trigger_custom_code
Available options: llm_judge, custom_code
evaluation_trigger_custom_code
string

Python code to evaluate the trigger condition. Only used when evaluation_trigger=custom and trigger_type=custom_code.

custom_code
string

Python code that implements the metric evaluation. Required when type=custom_code. Must define a function evaluate(transcript, ...) -> bool | float | str.
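The response repeats the submitted fields and adds the read-only id. As a sketch, the id of the new metric could be read from the response body like this; the sample body is made up to match the documented shape, and metric_id_from_response is a hypothetical helper.

```python
import json

# Sample body shaped like the documented response; the values are made up.
response_text = '{"id": 123, "name": "Refund offered", "evaluation_trigger": "always"}'


def metric_id_from_response(text: str) -> int:
    """Parse a response body and return the read-only metric id."""
    metric = json.loads(text)
    metric_id = metric.get("id")
    if not isinstance(metric_id, int):
        raise ValueError("response has no integer id")
    return metric_id


metric_id = metric_id_from_response(response_text)
```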