GET /test_framework/v1/results/{id}/
Retrieve a test result
curl --request GET \
  --url https://api.cekura.ai/test_framework/v1/results/{id}/ \
  --header 'X-CEKURA-API-KEY: <api-key>'
{
  "id": "integer",
  "name": "string",
  "agent": "integer",
  "status": "string",
  "met_expected_outcome_count": "integer",
  "total_expected_outcome_count": "integer",
  "success_rate": "float",
  "run_as_text": "boolean",
  "is_cronjob": "boolean",
  "runs": {
    "run_id": {
      "id": "integer",
      "scenario": "integer",
      "outbound_number": "string",
      "expected_outcome": {
        "score": 100,
        "explanation": [
          "✅ Positive outcome explanation with checkmark emoji",
          "❌ Negative outcome explanation with X emoji"
        ],
        "outcome_alignments": [
          {
            "outcome": "string",
            "prompt_part": "string",
            "aligned": "boolean"
          }
        ]
      },
      "success": "boolean",
      "evaluation": {
        "metrics": [
          {
            "id": "integer",
            "name": "string",
            "type": "binary_workflow_adherence | binary_qualitative | continuous_qualitative | numeric | enum",
            "score": "number",
            "explanation": "string | array",
            "function_name": "string (optional)",
            "extra": {
              "categories": [
                {
                  "category": "string",
                  "deviation": "string (optional)",
                  "priority": "string (optional)"
                }
              ],
              "percentiles": {
                "p50": "number"
              }
            },
            "enum": "string (for enum type metrics only)"
          }
        ]
      },
      "timestamp": "datetime",
      "executed_at": "datetime",
      "error_message": "string",
      "status": "string",
      "duration": "string (MM:SS format)",
      "scenario_name": "string",
      "personality_name": "string",
      "metadata": "object",
      "inbound_number": "string"
    }
  },
  "overall_evaluation": {
    "success_rate": "number",
    "metric_summary": {
      "metric_id": {
        "id": "integer",
        "name": "string",
        "type": "string",
        "score": "number",
        "explanation": "string (optional)",
        "function_name": "string",
        "vocera_defined_metric_code": "string (optional)",
        "p50": "number (for numeric metrics)"
      }
    },
    "worst_performing_metrics": {
      "binary_adherence": [
        "array of metric_ids"
      ]
    },
    "numeric_metrics": [
      {
        "name": "string",
        "type": "numeric",
        "value": "number",
        "percentiles": {
          "p50": "number"
        }
      }
    ],
    "enum_metrics": [
      "array of metric_ids"
    ],
    "extra_metrics": [
      {
        "name": "string (e.g., 'Expected Outcome', 'Average Ringing Duration')",
        "type": "string",
        "value": "number",
        "percentiles": {
          "p50": "number (optional)"
        }
      }
    ]
  },
  "total_duration": "string (MM:SS format)",
  "total_runs_count": "integer",
  "completed_runs_count": "integer",
  "success_runs_count": "integer",
  "failed_runs_count": "integer",
  "scenarios": [
    {
      "id": "integer",
      "name": "string"
    }
  ],
  "critical_categories": "array",
  "metrics": "array",
  "domain": "string (nullable)",
  "domain_logo": "string (nullable)",
  "runs_by_tags": "object",
  "latency_data": "object",
  "failed_reasons": "array",
  "run_settings": {
    "override_value": 10,
    "frequency": 2,
    "concurrency_limit": 5,
    "personality_ids": [
      3,
      7
    ],
    "test_profile_ids": [
      20
    ],
    "mode": "same_number",
    "livekit_data": {
      "agent_name": "kit",
      "config": {},
      "url": "wss://example"
    }
  },
  "created_at": "datetime",
  "updated_at": "datetime"
}

Authorizations

X-CEKURA-API-KEY
string
header
required

API Key Authentication. It should be included in the header of each request.

Path Parameters

id
integer
required

A unique integer value identifying this result.
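The curl call above can be mirrored in Python with the standard library. This is a minimal sketch: the helper name `build_result_request` and the `BASE_URL` constant are illustrative, not part of the API. It only builds the request, so the header and path can be inspected without a network call.

```python
import urllib.request

# Base path taken from the curl example; the constant name is illustrative.
BASE_URL = "https://api.cekura.ai/test_framework/v1/results"


def build_result_request(result_id: int, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for a single test result."""
    url = f"{BASE_URL}/{int(result_id)}/"  # `id` is a required integer path parameter
    return urllib.request.Request(
        url,
        headers={"X-CEKURA-API-KEY": api_key},  # required on every request
        method="GET",
    )


# To send it and decode the body:
#   import json
#   with urllib.request.urlopen(build_result_request(123, "<api-key>")) as resp:
#       result = json.load(resp)
```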

Response

agent
integer
required
id
integer
read-only
name
string

Name of the result. Example: "Test Result 1"

Maximum string length: 255
agent_name
string
read-only

Name of the agent associated with this result

status
enum<string>
read-only

Current status of the result

  • running - Running
  • completed - Completed
  • failed - Failed
  • pending - Pending
  • in_progress - In Progress
  • evaluating - Evaluating
  • in_queue - In Queue
  • timeout - Timeout
  • cancelled - Cancelled
  • scaling_up - Scaling Up
Available options: running, completed, failed, pending, in_progress, evaluating, in_queue, timeout, cancelled, scaling_up
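A client that polls this endpoint needs to decide when a result is finished. The sketch below assumes that completed, failed, timeout, and cancelled are final states. The page lists the status values but does not say which are terminal, so `ASSUMED_TERMINAL` is a guess to confirm before relying on it.

```python
# Assumption: these four statuses are final. The reference lists the values
# but does not state which ones are terminal.
ASSUMED_TERMINAL = {"completed", "failed", "timeout", "cancelled"}

# All ten documented status values.
KNOWN_STATUSES = ASSUMED_TERMINAL | {
    "running", "pending", "in_progress", "evaluating", "in_queue", "scaling_up",
}


def is_finished(status: str) -> bool:
    """Return True if the status is one of the assumed terminal states."""
    if status not in KNOWN_STATUSES:
        raise ValueError(f"unexpected status: {status!r}")
    return status in ASSUMED_TERMINAL
```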
met_expected_outcome_count
string
read-only

Number of runs that fully met their expected outcomes with a score of 100

total_expected_outcome_count
string
read-only

Total number of runs that had expected outcomes defined
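The two counts can be combined into a ratio. The field list types them as strings while the sample response shows integers, so this sketch coerces with `int()` to handle either; the function name is illustrative.

```python
def expected_outcome_ratio(result: dict):
    """Fraction of runs that fully met their expected outcomes, or None."""
    total = int(result["total_expected_outcome_count"])
    if total == 0:
        return None  # no run had expected outcomes defined
    met = int(result["met_expected_outcome_count"])
    return met / total
```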

success_rate
number<double>
read-only

Success rate of the test runs

run_as_text
boolean
read-only

Whether this test was run in text mode instead of voice mode. Example: true or false

is_cronjob
string
read-only

Whether this result was created by a scheduled cronjob

runs
object
read-only

Run objects keyed by run ID string (e.g. {"12345": {...}}). The list endpoint returns runs as an array of summaries instead.
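Because `runs` is an object keyed by run ID rather than an array, it is often convenient to flatten it. A minimal sketch, using only fields shown in the sample response (`summarize_runs` is an illustrative name):

```python
def summarize_runs(result: dict) -> list:
    """Flatten the run-ID-keyed `runs` object into a list of small dicts."""
    runs = result.get("runs") or {}  # tolerate a missing or null value
    return [
        {
            "run_id": run_id,  # the key is the run ID as a string
            "scenario_name": run.get("scenario_name"),
            "status": run.get("status"),
            "success": run.get("success"),
        }
        for run_id, run in runs.items()
    ]


# Hypothetical payload fragment for illustration.
sample = {"runs": {"12345": {"scenario_name": "Booking", "status": "completed", "success": True}}}
rows = summarize_runs(sample)
```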

overall_evaluation
any | null

Overall evaluation of the test runs. Example:

{
  "success_rate": "number",
  "metric_summary": {
    "metric_id": {
      "id": "integer",
      "name": "string",
      "type": "string",
      "score": "number",
      "explanation": "string (optional)",
      "function_name": "string",
      "vocera_defined_metric_code": "string (optional)",
      "p50": "number (for numeric metrics)"
    }
  },
  "worst_performing_metrics": {
    "binary_adherence": [
      "array of metric_ids"
    ]
  },
  "numeric_metrics": [
    {
      "name": "string",
      "type": "numeric",
      "value": "number",
      "percentiles": {
        "p50": "number"
      }
    }
  ],
  "enum_metrics": [
    "array of metric_ids"
  ],
  "extra_metrics": [
    {
      "name": "string (e.g., 'Expected Outcome', 'Average Ringing Duration')",
      "type": "string",
      "value": "number",
      "percentiles": {
        "p50": "number (optional)"
      }
    }
  ]
}
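Since `overall_evaluation` may be null, code that reads `metric_summary` should guard against that. The sketch below lists the lowest-scoring metrics. It assumes lower scores are worse, which the page does not state, and the function name is illustrative.

```python
def lowest_scoring_metrics(result: dict, limit: int = 3) -> list:
    """Return up to `limit` (name, score) pairs, lowest score first."""
    overall = result.get("overall_evaluation") or {}  # field is nullable
    summary = overall.get("metric_summary") or {}     # keyed by metric ID
    scored = [
        (metric.get("name"), metric.get("score"))
        for metric in summary.values()
        if isinstance(metric.get("score"), (int, float))  # skip missing scores
    ]
    return sorted(scored, key=lambda pair: pair[1])[:limit]
```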
total_duration
string
read-only

Total duration of the test runs for this result, in MM:SS format. Example: 22:30
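For arithmetic or sorting, the duration string can be converted to seconds. A minimal sketch that assumes the value always has exactly two colon-separated parts; any other shape raises `ValueError`.

```python
def duration_to_seconds(value: str) -> int:
    """Convert an MM:SS string such as '22:30' to a number of seconds."""
    minutes, seconds = value.split(":")  # raises ValueError if not two parts
    return int(minutes) * 60 + int(seconds)
```

With the documented example, `duration_to_seconds("22:30")` gives 1350.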

total_runs_count
string
read-only

Total number of test runs associated with this result. Example: 10

completed_runs_count
string
read-only

Number of test runs that have completed successfully. Example: 10

success_runs_count
string
read-only

Number of test runs that were marked as successful. Example: 10

failed_runs_count
string
read-only

Number of test runs that failed or encountered errors. Example: 10

connected_runs
string
read-only

List of run IDs that got connected successfully (have transcript data). Returns empty list if no connected runs.

failed_infrastructure_runs
string
read-only

List of run IDs that failed the infrastructure issues metric (score=0). Returns null if the metric is not found, or an empty list if the metric exists but no run failed it.

failed_workflow_runs
string
read-only

List of run IDs that failed the expected outcome metric (score=0). Returns null if the metric is not found, or an empty list if the metric exists but no run failed it.
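The null versus empty-list distinction in `failed_infrastructure_runs` and `failed_workflow_runs` is easy to lose with a plain truthiness check. The sketch below keeps the two cases apart. It assumes the fields arrive as JSON arrays or null, although the field list types them as strings; the function name is illustrative.

```python
def describe_failures(result: dict, field: str) -> str:
    """Summarize a failed-run field, keeping null and [] distinct."""
    run_ids = result.get(field)
    if run_ids is None:
        return "metric not found"  # null: the metric was not evaluated
    if len(run_ids) == 0:
        return "no failures"       # empty list: metric exists, nothing failed
    return f"{len(run_ids)} failed run(s)"
```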

successful_calls
string
read-only

List of run IDs that completed successfully. Returns empty list if no successful runs.

scenarios
string
read-only

List of scenarios (ID and name) used in the test runs for this result. Example:

[
  {
    "id": 123,
    "name": "Scenario 1"
  },
  {
    "id": 456,
    "name": "Scenario 2"
  }
]
critical_categories
string
read-only

List of critical categories for this result. Example:

[
  {
    "id": 2950,
    "name": "Pronunciation Analysis",
    "eval_type": "continuous_qualitative",
    "simulation_enabled": true,
    "observability_enabled": true
  },
  {
    "id": 3284,
    "name": "Latency",
    "eval_type": "numeric",
    "simulation_enabled": true,
    "observability_enabled": false
  },
  {
    "id": 3295,
    "name": "Detect Silence in Conversation",
    "eval_type": "binary_qualitative",
    "simulation_enabled": true,
    "observability_enabled": true
  }
]
metrics
string
read-only
runs_by_tags
string
read-only
latency_data
string
read-only
failed_reasons
any | null

Failed reasons of the test runs. Example:

{
  "issues": [
    {
      "rank": 1,
      "run_ids": [
        34588
      ],
      "description": "The agent did not provide the standard greeting, emergency disclaimer, or ask how they could help.",
      "affected_count": 1
    },
    {
      "rank": 2,
      "run_ids": [
        34588
      ],
      "description": "The agent did not explain the distinction between the primary care and express clinics.",
      "affected_count": 1
    }
  ],
  "total_failed_runs": 1
}
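The ranked issues can be turned into a short report. This sketch assumes `failed_reasons` has the object shape shown in the example above (the top-level response sample labels it "array") and tolerates null; `top_issues` is an illustrative name.

```python
def top_issues(result: dict, limit: int = 5) -> list:
    """Format the highest-ranked failure reasons as one line each."""
    reasons = result.get("failed_reasons") or {}  # field is nullable
    issues = sorted(reasons.get("issues", []), key=lambda issue: issue["rank"])
    return [
        f'#{issue["rank"]} ({issue["affected_count"]} run(s)): {issue["description"]}'
        for issue in issues[:limit]
    ]
```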
run_type
string | null
read-only

Run execution type

run_settings
any | null

Snapshot of the override-relevant request payload (personality_ids, frequency, concurrency_limit, connection-specific overrides, etc.).

created_at
string<date-time>

Timestamp when this test result was created. Example: 2021-01-01 00:00:00

updated_at
string<date-time>
read-only

Timestamp when this test result was last updated. Example: 2021-01-01 00:00:00
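The timestamps are typed as date-time strings, but the example uses a space separator and no offset. The sketch below accepts that form as well as ISO 8601 with a trailing "Z". Whether the API ever returns the "Z" form is an assumption, and the function name is illustrative.

```python
from datetime import datetime


def parse_timestamp(value: str) -> datetime:
    """Parse created_at / updated_at into a datetime."""
    # Older Python versions reject a trailing "Z", so normalize it first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))
```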