GET /test_framework/v1/results/{id}
cURL
curl --request GET \
  --url https://api.cekura.ai/test_framework/v1/results/{id}/ \
  --header 'X-CEKURA-API-KEY: <api-key>'
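The same request can be made from Python using only the standard library. This is a minimal sketch: error handling, retries, and API-key storage are left to the caller.

```python
import json
import urllib.request

BASE_URL = "https://api.cekura.ai/test_framework/v1"

def result_url(result_id: int) -> str:
    # Trailing slash matches the cURL example above.
    return f"{BASE_URL}/results/{result_id}/"

def get_result(result_id: int, api_key: str) -> dict:
    """Fetch a single test result and return the parsed JSON body."""
    req = urllib.request.Request(
        result_url(result_id),
        headers={"X-CEKURA-API-KEY": api_key},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)
```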
{
  "id": "integer",
  "name": "string",
  "agent": "integer",
  "status": "string",
  "met_expected_outcome_count": "integer",
  "total_expected_outcome_count": "integer",
  "success_rate": "float",
  "run_as_text": "boolean",
  "is_cronjob": "boolean",
  "runs": {
    "run_id": {
      "id": "integer",
      "scenario": "integer",
      "outbound_number": "string",
      "expected_outcome": {
        "score": 100,
        "explanation": [
          "✅ Positive outcome explanation with checkmark emoji",
          "❌ Negative outcome explanation with X emoji"
        ],
        "outcome_alignments": [
          {
            "outcome": "string",
            "prompt_part": "string",
            "aligned": "boolean"
          }
        ]
      },
      "success": "boolean",
      "evaluation": {
        "metrics": [
          {
            "id": "integer",
            "name": "string",
            "type": "binary_workflow_adherence | binary_qualitative | continuous_qualitative | numeric | enum",
            "score": "number",
            "explanation": "string | array",
            "function_name": "string (optional)",
            "extra": {
              "categories": [
                {
                  "category": "string",
                  "deviation": "string (optional)",
                  "priority": "string (optional)"
                }
              ],
              "percentiles": {
                "p50": "number"
              }
            },
            "enum": "string (for enum type metrics only)"
          }
        ]
      },
      "timestamp": "datetime",
      "executed_at": "datetime",
      "error_message": "string",
      "status": "string",
      "duration": "string (MM:SS format)",
      "scenario_name": "string",
      "personality_name": "string",
      "metadata": "object",
      "inbound_number": "string"
    }
  },
  "overall_evaluation": {
    "success_rate": "number",
    "metric_summary": {
      "metric_id": {
        "id": "integer",
        "name": "string",
        "type": "string",
        "score": "number",
        "explanation": "string (optional)",
        "function_name": "string",
        "vocera_defined_metric_code": "string (optional)",
        "p50": "number (for numeric metrics)"
      }
    },
    "worst_performing_metrics": {
      "binary_adherence": [
        "array of metric_ids"
      ]
    },
    "numeric_metrics": [
      {
        "name": "string",
        "type": "numeric",
        "value": "number",
        "percentiles": {
          "p50": "number"
        }
      }
    ],
    "enum_metrics": [
      "array of metric_ids"
    ],
    "extra_metrics": [
      {
        "name": "string (e.g., 'Expected Outcome', 'Average Ringing Duration')",
        "type": "string",
        "value": "number",
        "percentiles": {
          "p50": "number (optional)"
        }
      }
    ]
  },
  "total_duration": "string (MM:SS format)",
  "total_runs_count": "integer",
  "completed_runs_count": "integer",
  "success_runs_count": "integer",
  "failed_runs_count": "integer",
  "scenarios": [
    {
      "id": "integer",
      "name": "string"
    }
  ],
  "critical_categories": "array",
  "metrics": "array",
  "domain": "string (nullable)",
  "domain_logo": "string (nullable)",
  "runs_by_tags": "object",
  "latency_data": "object",
  "failed_reasons": "array",
  "created_at": "datetime",
  "updated_at": "datetime"
}
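As a quick consistency check on the counters in this schema, `success_rate` can be recomputed from `success_runs_count` and `total_runs_count`. Treating the rate as a percentage is an assumption of this sketch, not something the schema states.

```python
def success_rate_from_counts(result: dict) -> float:
    """Recompute the success rate (assumed to be a percentage) from run counters.

    Returns 0.0 when the result has no runs, avoiding a division by zero.
    """
    total = result.get("total_runs_count") or 0
    if not total:
        return 0.0
    return 100.0 * result["success_runs_count"] / total
```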

Authorizations

X-CEKURA-API-KEY
string
header
required

API Key Authentication. It should be included in the header of each request.

Path Parameters

id
integer
required

A unique integer value identifying this result.

Response

id
integer
name
string

Name of the result. Example: "Test Result 1"

Maximum string length: 255
agent
integer
status
enum<string>

Current status of the result

  • running - Running
  • completed - Completed
  • failed - Failed
  • pending - Pending
  • in_progress - In Progress
  • evaluating - Evaluating
  • in_queue - In Queue
  • timeout - Timeout
  • cancelled - Cancelled
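Of these statuses, `completed`, `failed`, `timeout`, and `cancelled` are terminal; the rest indicate the result is still being produced. A minimal polling sketch (the `fetch` callable is a placeholder for your own GET wrapper, such as one built from the cURL example above):

```python
import time

# Statuses after which the result will no longer change.
TERMINAL_STATUSES = {"completed", "failed", "timeout", "cancelled"}

def wait_for_result(fetch, result_id, poll_seconds=10, max_polls=60):
    """Poll fetch(result_id) until the result reaches a terminal status."""
    for _ in range(max_polls):
        result = fetch(result_id)
        if result["status"] in TERMINAL_STATUSES:
            return result
        time.sleep(poll_seconds)
    raise TimeoutError(f"result {result_id} did not reach a terminal status")
```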
met_expected_outcome_count
integer

Number of runs that fully met their expected outcomes with a score of 100

total_expected_outcome_count
integer

Total number of runs that had expected outcomes defined

success_rate
number<double>

Success rate of the test runs

run_as_text
boolean

Whether this test was run in text mode instead of voice mode. Example: true

is_cronjob
boolean

Whether this result was created by a scheduled cronjob

runs
object

Runs

{
  "<run_id>": {
    "id": "<integer>",
    "scenario": "<integer>",
    "expected_outcome": {
      "score": 0,
      "explanation": [
        "❌ The main agent did not provide the standard greeting, emergency disclaimer, or ask how they can help (00:21).",
        "❌ The main agent did not explain the distinction between Nacogdoches Health Partners and FastTrack Express Clinic (entire call)."
      ],
      "outcome_alignments": [
        {
          "aligned": false,
          "outcome": "The main agent did not provide the standard greeting, emergency disclaimer, or ask how they can help (00:21).",
          "prompt_part": "The main agent should provide the standard greeting and emergency disclaimer, asking how they can help."
        },
        {
          "aligned": false,
          "outcome": "The main agent did not explain the distinction between Nacogdoches Health Partners and FastTrack Express Clinic (entire call).",
          "prompt_part": "The main agent should explain the distinction between Nacogdoches Health Partners (focused on comprehensive primary care and chronic conditions) and FastTrack Express Clinic (focused on acute or urgent matters)."
        }
      ]
    }
  }
}
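A common consumer of the `runs` object is a report of every outcome alignment that failed. A sketch that walks the structure above, tolerating absent keys:

```python
def misaligned_outcomes(result: dict) -> list:
    """Return (run_id, outcome) pairs for every failed outcome alignment."""
    failures = []
    for run_id, run in (result.get("runs") or {}).items():
        expected = run.get("expected_outcome") or {}
        for alignment in expected.get("outcome_alignments", []):
            if not alignment.get("aligned", True):
                failures.append((run_id, alignment["outcome"]))
    return failures
```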
overall_evaluation
any | null

Overall evaluation of the test runs. Example:

{
  "success_rate": "number",
  "metric_summary": {
    "metric_id": {
      "id": "integer",
      "name": "string",
      "type": "string",
      "score": "number",
      "explanation": "string (optional)",
      "function_name": "string",
      "vocera_defined_metric_code": "string (optional)",
      "p50": "number (for numeric metrics)"
    }
  },
  "worst_performing_metrics": {
    "binary_adherence": [
      "array of metric_ids"
    ]
  },
  "numeric_metrics": [
    {
      "name": "string",
      "type": "numeric",
      "value": "number",
      "percentiles": {
        "p50": "number"
      }
    }
  ],
  "enum_metrics": [
    "array of metric_ids"
  ],
  "extra_metrics": [
    {
      "name": "string (e.g., 'Expected Outcome', 'Average Ringing Duration')",
      "type": "string",
      "value": "number",
      "percentiles": {
        "p50": "number (optional)"
      }
    }
  ]
}
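The metric IDs in `worst_performing_metrics` can be resolved to names via `metric_summary`. A sketch that assumes the summary is keyed by metric ID (as a string or integer, since the example does not pin this down):

```python
def worst_metric_names(overall: dict) -> list:
    """Resolve worst-performing metric IDs to their names via metric_summary."""
    summary = overall.get("metric_summary", {})
    worst = overall.get("worst_performing_metrics") or {}
    names = []
    for metric_ids in worst.values():
        for mid in metric_ids:
            # Tolerate string or integer keys in the summary mapping.
            entry = summary.get(str(mid)) or summary.get(mid)
            names.append(entry["name"] if entry else str(mid))
    return names
```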
total_duration
string

Total duration of the test runs for this result. Example: 22:30
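Duration fields use an MM:SS string format, so arithmetic on them requires a small parser. A sketch:

```python
def duration_seconds(duration: str) -> int:
    """Parse an MM:SS duration string (e.g. "22:30") into total seconds."""
    minutes, seconds = duration.split(":")
    return int(minutes) * 60 + int(seconds)
```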

total_runs_count
integer

Total number of test runs associated with this result. Example: 10

completed_runs_count
integer

Number of test runs that have completed successfully. Example: 10

success_runs_count
integer

Number of test runs that were marked as successful. Example: 10

failed_runs_count
integer

Number of test runs that failed or encountered errors. Example: 10

connected_runs
array

List of run IDs that connected successfully (have transcript data). Returns an empty list if no runs connected.

failed_infrastructure_runs
array | null

List of run IDs that failed the infrastructure issues metric (score = 0). Returns null if the metric is not found, and an empty list if the metric exists but no runs failed it.

failed_workflow_runs
array | null

List of run IDs that failed the expected outcome metric (score = 0). Returns null if the metric is not found, and an empty list if the metric exists but no runs failed it.

successful_calls
array

List of run IDs that completed successfully. Returns an empty list if no runs succeeded.
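These run-ID lists can be intersected to answer questions like "which runs connected but still failed the expected outcome metric?" A sketch that treats a null `failed_workflow_runs` (metric not configured) as no failures:

```python
def connected_but_failed(result: dict) -> set:
    """Run IDs that have transcripts yet failed the expected outcome metric."""
    connected = set(result.get("connected_runs") or [])
    failed = set(result.get("failed_workflow_runs") or [])
    return connected & failed
```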

scenarios
array

List of scenarios (id and name) used in the test runs for this result. Example:

[
  {
    "id": 123,
    "name": "Scenario 1"
  },
  {
    "id": 456,
    "name": "Scenario 2"
  }
]
critical_categories
array

List of critical categories for this result. Example:

[
  {
    "id": 2950,
    "name": "Pronunciation Analysis",
    "eval_type": "continuous_qualitative",
    "simulation_enabled": true,
    "observability_enabled": true
  },
  {
    "id": 3284,
    "name": "Latency",
    "eval_type": "numeric",
    "simulation_enabled": true,
    "observability_enabled": false
  },
  {
    "id": 3295,
    "name": "Detect Silence in Conversation",
    "eval_type": "binary_qualitative",
    "simulation_enabled": true,
    "observability_enabled": true
  }
]
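Each critical category carries `simulation_enabled` and `observability_enabled` flags, so filtering by either is a one-liner. A sketch:

```python
def observability_category_names(critical_categories: list) -> list:
    """Names of critical categories that have observability enabled."""
    return [c["name"] for c in critical_categories if c.get("observability_enabled")]
```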
metrics
array
runs_by_tags
object
latency_data
object
failed_reasons
any | null

Failed reasons of the test runs. Example:

{
  "issues": [
    {
      "rank": 1,
      "run_ids": [
        34588
      ],
      "description": "The agent did not provide the standard greeting, emergency disclaimer, or ask how they could help.",
      "affected_count": 1
    },
    {
      "rank": 2,
      "run_ids": [
        34588
      ],
      "description": "The agent did not explain the distinction between the primary care and express clinics.",
      "affected_count": 1
    }
  ],
  "total_failed_runs": 1
}
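Since issues arrive with a `rank`, pulling out the top failure reason is straightforward. A sketch that also handles a null `failed_reasons`:

```python
def top_failure(failed_reasons):
    """Return the description of the rank-1 issue, or None if there are none."""
    if not failed_reasons:
        return None
    issues = sorted(failed_reasons.get("issues", []), key=lambda i: i["rank"])
    return issues[0]["description"] if issues else None
```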
created_at
string<date-time>

Timestamp when this test result was created. Example: 2021-01-01 00:00:00

updated_at
string<date-time>

Timestamp when this test result was last updated. Example: 2021-01-01 00:00:00