Pipecat Tracing

Overview

Pipecat Tracing provides deep observability into your Pipecat agent’s performance by integrating the Cekura Python SDK directly into your agent code. The integration adds agent-side data to the Cekura platform, giving you end-to-end visibility into agent execution.

What you get:

Complete agent-side conversation transcripts with UTC timestamps
Tool/function calls with inputs and outputs
Dual-channel audio recording (agent + user) for production monitoring
OpenTelemetry traces with conversation, turn, and service spans (STT, LLM, TTS)
Session log capture from the agent
Session metadata for correlation and debugging

Replacing an existing direct API integration? If you currently post call data to the POST /observability/v1/observe/ endpoint directly, the Pipecat SDK is a replacement, not an add-on — these two approaches are mutually exclusive for the same session. Running both creates duplicate records with no merging. Migrate by moving your metadata to set_custom_metadata() and removing the direct API call. See Migrating from the direct API to the SDK for step-by-step instructions.

Prerequisites

A Cekura account with an API key
A Pipecat agent project

Setup

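Two setups are available: Testing, for simulation calls run from the Cekura platform, and Observability, for monitoring production calls.

Testing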

Use this setup in your test agents while running simulation calls from the Cekura platform. This mode captures transcripts, logs, and OTel traces — no audio recording.

Install the Cekura Python SDK

pip install "cekura[pipecat]>=1.5.2"

Integrate the SDK in your Pipecat agent

Add the Cekura tracer to your Pipecat agent:

Initialize the tracer with your API key and agent ID
Track your pipeline to capture transcripts, tool calls, logs, and OTel traces
Register task handlers for automatic cleanup

import os
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.task import PipelineTask
from pipecat.pipeline.runner import PipelineRunner
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextAggregatorPair,
    LLMUserAggregatorParams,
)
from pipecat.audio.vad.silero import SileroVADAnalyzer

from cekura.pipecat import PipecatTracer

async def run_bot(transport, runner_args):
    # Build your pipeline with aggregators.
    # stt, llm, tts, and tools are placeholders for your own services and tool schema.
    context = LLMContext(tools=tools)
    user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
        context,
        user_params=LLMUserAggregatorParams(
            vad_analyzer=SileroVADAnalyzer(),
        ),
    )

    pipeline = Pipeline([
        transport.input(),
        stt,
        user_aggregator,
        llm,
        tts,
        transport.output(),
        assistant_aggregator,
    ])

    # Initialize Cekura tracer
    cekura = PipecatTracer(
        api_key=os.getenv("CEKURA_API_KEY"),
        agent_id=123,  # Your agent ID from Cekura dashboard
    )

    # Track pipeline (transcript only, no audio)
    pipeline = cekura.track_pipeline(pipeline, context, runner_args=runner_args)

    # Create task with OTel tracing enabled
    task = PipelineTask(pipeline, enable_tracing=True, enable_turn_tracking=True)

    # Register task handlers for automatic cleanup
    task = cekura.register_task_handlers(task, transport=transport)

    # Run pipeline as usual
    runner = PipelineRunner()
    await runner.run(task)

What this does:

Captures transcripts and tool calls
Captures session logs (INFO+)
Exports OpenTelemetry traces (conversation, turn, STT/LLM/TTS spans)
Sends data to Cekura’s simulation webhook for test analysis
No audio recording — lightweight for testing

For simpler setups, you can combine all steps into one call:

task = cekura.track_and_create_task(
    pipeline, context, runner_args=runner_args, transport=transport,
    custom_metadata={"bot_version": "1.0"},  # optional
    **pipeline_task_kwargs  # additional args forwarded to PipelineTask
)

Configure Pipecat provider and enable tracing

Navigate to your agent settings in the Cekura dashboard, select Pipecat as the provider, and enable tracing.

Required configuration:

Provider: Select “Pipecat” as your voice integration provider
Pipecat Cloud API Key: Your authentication key from Pipecat Cloud
Pipecat Agent Name: Your agent name in Pipecat (e.g., “my-agent”)
Tracing Enabled: Set to true to enable SDK-based tracing

Testing connection types:

Configure at least one connection type for voice-based testing:

WebRTC: Direct Pipecat session connection using the credentials configured above
Telephony: Phone-based testing if your Pipecat agent is connected to a phone system (requires Contact Number)

See the Pipecat Automated Testing guide for details on all configuration fields including Agent Configuration JSON and Room Properties.

Run tests

Run tests using your preferred connection type:

WebRTC: Select WebRTC under Voice connections in the Configure Run dialog for WebRTC-based testing
Telephony: Select Telephony under Voice connections for phone-based testing

Test results will appear in the Runs section with enhanced data including transcripts, tool calls, logs, and OTel traces.

Observability

Use this setup in your production agents for monitoring live calls with audio recording, logs, and OTel traces.

Install the Cekura Python SDK

pip install "cekura[pipecat]>=1.5.2"

Integrate the SDK in your Pipecat agent

Add the Cekura tracer to your Pipecat agent:

Initialize the tracer with your API key and agent ID
Observe your pipeline to capture transcripts, tool calls, audio, logs, and OTel traces
Register task handlers for audio recording and automatic cleanup

import os
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.task import PipelineTask
from pipecat.pipeline.runner import PipelineRunner
from pipecat.processors.aggregators.llm_context import LLMContext
from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextAggregatorPair,
    LLMUserAggregatorParams,
)
from pipecat.audio.vad.silero import SileroVADAnalyzer

from cekura.pipecat import PipecatTracer

async def run_bot(transport, runner_args):
    # Build your pipeline with aggregators.
    # stt, llm, tts, and tools are placeholders for your own services and tool schema.
    context = LLMContext(tools=tools)
    user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
        context,
        user_params=LLMUserAggregatorParams(
            vad_analyzer=SileroVADAnalyzer(),
        ),
    )

    pipeline = Pipeline([
        transport.input(),
        stt,
        user_aggregator,
        llm,
        tts,
        transport.output(),
        assistant_aggregator,
    ])

    # Initialize Cekura tracer
    cekura = PipecatTracer(
        api_key=os.getenv("CEKURA_API_KEY"),
        agent_id=123,  # Your agent ID from Cekura dashboard
    )

    # Observe pipeline (transcript + audio recording)
    pipeline = cekura.observe_pipeline(pipeline, context, runner_args=runner_args)

    # Create task with OTel tracing enabled
    task = PipelineTask(pipeline, enable_tracing=True, enable_turn_tracking=True)

    # Register task handlers for audio recording
    task = cekura.register_task_handlers(task, transport=transport)

    # Run pipeline as usual
    runner = PipelineRunner()
    await runner.run(task)

What this does:

Captures transcripts, tool calls, and session metadata
Records dual-channel audio (agent + user) for analysis
Captures session logs (INFO+)
Exports OpenTelemetry traces (conversation, turn, STT/LLM/TTS spans)
Exports data to Cekura observability endpoint

observe_pipeline() registers its own audio frame processor and performs a parallel recording independently of any audio capture you already have in your pipeline. The two recordings are completely separate — the SDK’s recording is asynchronous and does not add latency. If you already record audio and want to understand the impact, see Will the Cekura SDK record audio again if I already capture it? in the FAQ. To avoid recording entirely or control it per call, see Can I disable or selectively control audio recording?.

For simpler setups, you can combine all steps into one call:

task = cekura.observe_and_create_task(
    pipeline, context, runner_args=runner_args, transport=transport,
    custom_metadata={"bot_version": "1.0"},  # optional
    **pipeline_task_kwargs  # additional args forwarded to PipelineTask
)

Monitor production calls

Once the SDK is configured in your agent code, all production calls will automatically appear in the Calls section of your Cekura dashboard with enhanced data. No additional dashboard configuration is required for observability mode.

Enhanced Data in Cekura UI

With tracing enabled, you’ll see enriched information in the Cekura platform:

Complete Transcript: Full conversation history with UTC timestamps
Tool Calls: Function call requests and responses with inputs and outputs
OpenTelemetry Traces: Conversation, turn, and service-level spans (STT, LLM, TTS) with latency and usage metrics
Session Logs: INFO+ logs captured during the session for debugging
Session Metadata: Session IDs for correlating Cekura calls with your logs
Audio Recording: Dual-channel recordings (agent + user) for quality monitoring (observability mode only)
Custom Metadata: Additional metadata passed via custom_metadata parameter

OpenTelemetry Tracing

The SDK automatically exports OpenTelemetry traces for every call, giving you detailed visibility into your agent’s execution pipeline. Traces are exported to the Cekura platform by default and correlated with call logs automatically.

Trace Structure

Each call produces a hierarchical trace:

Conversation
├── Turn 1
│   ├── stt (Deepgram, Google, etc.)
│   ├── llm (OpenAI, Gemini, etc.) — includes token usage, model, tools
│   └── tts (Cartesia, ElevenLabs, etc.) — includes character count
├── Turn 2
│   ├── stt
│   ├── llm → tool_call
│   ├── llm → tool_result
│   └── tts
└── Turn N
    └── ...

Each span includes service-specific attributes like model name, token usage, latency, and TTFB metrics.
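To make the hierarchy concrete, the sketch below models a conversation as nested spans and sums the time spent per service across turns. The `Span` class is a stand-in written for this example, not an OpenTelemetry or Cekura type, and the durations are illustrative values rather than measurements.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """Stand-in for a trace span; not an OpenTelemetry or Cekura type."""
    name: str
    duration_ms: float = 0.0
    children: list["Span"] = field(default_factory=list)


def total_by_service(conversation: Span) -> dict[str, float]:
    """Sum service-span durations (stt, llm, tts) across every turn."""
    totals: dict[str, float] = {}
    for turn in conversation.children:
        for service in turn.children:
            totals[service.name] = totals.get(service.name, 0.0) + service.duration_ms
    return totals


# Illustrative values only, not measurements.
conversation = Span("conversation", children=[
    Span("turn", children=[Span("stt", 120.0), Span("llm", 450.0), Span("tts", 200.0)]),
    Span("turn", children=[Span("stt", 100.0), Span("llm", 550.0), Span("tts", 180.0)]),
])

print(total_by_service(conversation))
# {'stt': 220.0, 'llm': 1000.0, 'tts': 380.0}
```

The same conversation, turn, service walk applies when reading a trace in the Cekura UI: start from the conversation span, open a turn, then compare the service spans inside it.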

How It Works

When using the single-step API (observe_and_create_task / track_and_create_task), OTel tracing is configured automatically — no additional code needed. When using the multi-step API (observe_pipeline / track_pipeline), you need to enable tracing on the PipelineTask yourself:

task = PipelineTask(pipeline, enable_tracing=True, enable_turn_tracking=True)

The SDK sets up the OTel provider in both cases. In the multi-step API, it’s ready for Pipecat to use once you enable tracing on the task.

Additional Span Attributes

You can attach custom attributes to the conversation-level span via additional_span_attributes on the PipelineTask. This is useful for adding your own correlation IDs or metadata to traces.

# Single-step: pass as pipeline_task_kwargs
task = cekura.observe_and_create_task(
    pipeline, context, runner_args=runner_args, transport=transport,
    additional_span_attributes={
        "user.id": "user_123",
        "session.type": "support_call",
    }
)

# Multi-step: pass directly to PipelineTask
task = PipelineTask(
    pipeline,
    enable_tracing=True,
    enable_turn_tracking=True,
    additional_span_attributes={
        "user.id": "user_123",
        "session.type": "support_call",
    }
)
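OpenTelemetry attribute values are limited to strings, booleans, numbers, and sequences of those, so it can help to normalize your own data before passing it as additional_span_attributes. The helper below is a minimal sketch of that idea. The function name and the keys `call.attempt`, `customer.tier`, and `experiment.flags` are hypothetical and chosen only for illustration.

```python
from typing import Any

_PRIMITIVES = (str, bool, int, float)


def to_span_attributes(raw: dict[str, Any]) -> dict[str, Any]:
    """Coerce values into types OpenTelemetry accepts as attribute values."""
    attributes: dict[str, Any] = {}
    for key, value in raw.items():
        if value is None:
            continue  # drop missing values instead of sending the string "None"
        if isinstance(value, _PRIMITIVES):
            attributes[key] = value
        elif isinstance(value, (list, tuple)) and all(isinstance(v, str) for v in value):
            attributes[key] = list(value)  # sequences of strings are allowed as-is
        else:
            attributes[key] = str(value)  # fall back to a string representation
    return attributes


attrs = to_span_attributes({
    "user.id": "user_123",
    "session.type": "support_call",
    "call.attempt": 2,                   # hypothetical key
    "customer.tier": None,               # hypothetical key, dropped
    "experiment.flags": {"beta": True},  # hypothetical key, stringified
})
print(attrs)
```

The resulting dictionary can then be passed as additional_span_attributes in either the single-step or the multi-step form shown above.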

Disabling OTel Traces

To disable OTel trace export entirely:

cekura = PipecatTracer(
    api_key="...",
    agent_id=123,
    enable_otel_traces=False,
)

When disabled, no OTel provider is set up and no traces are exported. All other SDK features (transcripts, audio, logs) continue to work normally.

Custom Metadata

Pass custom metadata at initialization or update it anytime before the session tears down:

# At initialization
task = cekura.observe_and_create_task(
    pipeline, context, runner_args=runner_args, transport=transport,
    custom_metadata={"bot_version": "1.0", "environment": "staging"}
)

# Or update anytime before the session ends (including post-processing handlers
# that run before the pipeline closes)
cekura.set_custom_metadata({"bot_version": "1.0", "environment": "staging"})

set_custom_metadata() can be called at any point before the pipeline session finalizes — including in cleanup or post-processing handlers that execute before the session tears down. Once the session has ended and data has been submitted to Cekura, the record is immutable and metadata cannot be updated.
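One way to keep metadata consistent between the initial call and later updates is to build it in a single helper. The sketch below assumes the values come from environment variables. `BOT_VERSION`, `APP_ENV`, and the `call_outcome` key are hypothetical names used only for this example.

```python
import os
from typing import Optional


def build_custom_metadata(outcome: Optional[str] = None) -> dict:
    """Collect metadata for one call.

    BOT_VERSION and APP_ENV are hypothetical environment variable names.
    """
    metadata = {
        "bot_version": os.getenv("BOT_VERSION", "unknown"),
        "environment": os.getenv("APP_ENV", "development"),
    }
    if outcome is not None:
        metadata["call_outcome"] = outcome  # hypothetical key
    return metadata


# Usage inside your agent code, before the session tears down:
#   cekura.set_custom_metadata(build_custom_metadata(outcome="resolved"))
```

The helper returns the full dictionary on every call, so the result does not depend on whether set_custom_metadata() merges with or replaces earlier values, which this page does not specify.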

Deferred Upload (Compliance)

For compliance scenarios where you need user consent before sending any data to the backend, use defer_upload=True. All audio is buffered locally and no data (audio, transcript, logs) is sent until you explicitly grant consent.

pipeline = cekura.observe_pipeline(pipeline, context, runner_args=runner_args, defer_upload=True)

# Grant consent — flushes buffered audio and enables upload:
await cekura.start_audio_upload()

# Or abort mid-call to discard everything immediately:
await cekura.abort()

If neither method is called before the session ends, all data is automatically discarded and nothing is sent to the backend. defer_upload also works with the single-step API:

task = cekura.observe_and_create_task(
    pipeline, context, runner_args=runner_args, transport=transport,
    defer_upload=True,
)
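The consent decision usually arrives from your own application logic, for example a tool call or a keypad response. The sketch below shows one way to route that decision to the two documented methods. `apply_consent` and `FakeTracer` are hypothetical names; `FakeTracer` only stands in for a PipecatTracer so the flow can be run without the SDK.

```python
import asyncio


class FakeTracer:
    """Stand-in exposing the two documented coroutine methods, for a dry run."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    async def start_audio_upload(self) -> None:
        self.calls.append("start_audio_upload")

    async def abort(self) -> None:
        self.calls.append("abort")


async def apply_consent(tracer, consent_given: bool) -> None:
    """Release buffered data on consent, otherwise discard it immediately."""
    if consent_given:
        await tracer.start_audio_upload()
    else:
        await tracer.abort()


async def demo(consent_given: bool) -> list[str]:
    tracer = FakeTracer()
    await apply_consent(tracer, consent_given)
    return tracer.calls


print(asyncio.run(demo(True)))   # ['start_audio_upload']
print(asyncio.run(demo(False)))  # ['abort']
```

In a real agent you would pass your PipecatTracer instance instead of `FakeTracer`. Calling abort() on refusal is optional, since data is discarded at session end when neither method is called, but it stops capture right away.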

If you currently send call data to observability/v1/observe/ and want to adopt the SDK, or are wondering whether both can run simultaneously, see Can I also add the Pipecat SDK if I already post to the observability endpoint? in the FAQ.

If you already capture audio in your own pipeline and want to understand how the SDK’s parallel recording works, see If I already capture and store audio in my own Pipecat pipeline, will the Cekura SDK record it again? in the FAQ.

Best Practices

Use the right method for your environment: Use track_pipeline() / track_and_create_task() in your test/UAT environments for simulation testing. Use observe_pipeline() / observe_and_create_task() in your production environment for monitoring live calls with audio recording.
Use environment variables for credentials: Don’t hardcode API keys in your code
Keep the SDK updated: Run pip install --upgrade cekura periodically for the latest features
Use session IDs: The SDK resolves the session ID in this order: explicit session_id parameter > runner_args.session_id > auto-generated. Pass a custom session ID to correlate Cekura data with your application logs
Create one PipecatTracer instance per call and do not share instances across concurrent sessions: Each PipecatTracer instance holds the state of a single session, so sharing one across calls is not safe. When running multiple concurrent calls in the same event loop, instantiate a fresh PipecatTracer for each call (with its own session_id) inside the function that handles that call. With this pattern the SDK is safe for concurrent use, because each instance holds its own session state and submits data independently.
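The session ID precedence described above can be sketched as a small function. This illustrates the documented order only and is not the SDK's implementation. The function name is hypothetical, and the UUID fallback is an assumption, since the format of auto-generated IDs is not stated here.

```python
import uuid
from types import SimpleNamespace
from typing import Any, Optional


def resolve_session_id(session_id: Optional[str] = None, runner_args: Any = None) -> str:
    """Apply the documented order: explicit > runner_args.session_id > generated."""
    if session_id:
        return session_id
    from_runner = getattr(runner_args, "session_id", None)
    if from_runner:
        return from_runner
    return str(uuid.uuid4())  # assumption: the SDK's generated format may differ


# SimpleNamespace stands in for Pipecat's runner arguments object.
runner_args = SimpleNamespace(session_id="runner-42")

print(resolve_session_id("call-7", runner_args))  # call-7
print(resolve_session_id(None, runner_args))      # runner-42
```

Passing an explicit session_id that matches the identifier in your own application logs gives the most predictable correlation.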

SDK Reference

PipecatTracer Initialization

from cekura.pipecat import PipecatTracer

tracer = PipecatTracer(
    api_key="your_api_key",                     # Required: Your Cekura API key
    agent_id=123,                               # Required: Agent ID from dashboard
    host="https://api.cekura.ai",               # Optional: Custom API host
    enabled=True,                               # Optional: Enable/disable tracer
    otel_endpoint="https://otel.cekura.ai",     # Optional: OTel Collector gRPC endpoint
    enable_otel_traces=True,                    # Optional: Enable OpenTelemetry trace export
    capture_logs=True,                          # Optional: Capture session logs (INFO+)
)
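To keep credentials out of source code, the constructor arguments can be assembled from environment variables. This is a minimal sketch: `CEKURA_API_KEY` appears in the examples above, while `CEKURA_AGENT_ID` and `ENABLE_OTEL` are hypothetical variable names chosen for this example and are not read by the SDK itself.

```python
import os


def tracer_kwargs_from_env() -> dict:
    """Build PipecatTracer keyword arguments from environment variables."""
    api_key = os.environ.get("CEKURA_API_KEY")
    if not api_key:
        raise RuntimeError("CEKURA_API_KEY is not set")
    return {
        "api_key": api_key,
        # CEKURA_AGENT_ID is a hypothetical variable name for this sketch.
        "agent_id": int(os.environ["CEKURA_AGENT_ID"]),
        # ENABLE_OTEL is hypothetical too; any value other than "false" keeps traces on.
        "enable_otel_traces": os.environ.get("ENABLE_OTEL", "true").lower() != "false",
    }


# Usage inside your agent code:
#   tracer = PipecatTracer(**tracer_kwargs_from_env())
```

Failing fast on a missing API key surfaces configuration mistakes at startup rather than as missing calls in the dashboard.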

observe_pipeline()

Adds observability to your pipeline — captures transcripts and records audio. For production monitoring.

pipeline = tracer.observe_pipeline(
    pipeline,              # Required: Pipecat Pipeline instance
    context,               # Required: LLMContext instance
    runner_args=None,      # Optional: RunnerArguments (session_id extracted if present)
    session_id=None,       # Optional: Takes precedence over runner_args.session_id
    custom_metadata=None,  # Optional: Dict of custom metadata
    defer_upload=False     # Optional: Hold all data until start_audio_upload() is called
)

Speaker channel assignment is handled automatically — the SDK identifies agent-side and user-side audio from the positions of the LLM aggregators in your Pipecat pipeline. No explicit bot_channel or user_channel parameters are required.

track_pipeline()

Adds simulation-mode tracking to your pipeline — captures transcripts only, no audio. For testing.

pipeline = tracer.track_pipeline(
    pipeline,              # Required: Pipecat Pipeline instance
    context,               # Required: LLMContext instance
    runner_args=None,      # Optional: RunnerArguments (session_id extracted if present)
    session_id=None,       # Optional: Takes precedence over runner_args.session_id
    custom_metadata=None   # Optional: Dict of custom metadata
)

observe_and_create_task()

Single-step observability setup. Combines observe_pipeline(), PipelineTask creation, and register_task_handlers(). Automatically configures OTel tracing when enabled.

task = tracer.observe_and_create_task(
    pipeline,              # Required: Pipecat Pipeline instance
    context,               # Required: LLMContext instance
    runner_args=None,      # Optional: RunnerArguments for session_id
    transport=None,        # Optional: Transport for cleanup on disconnect
    session_id=None,       # Optional: Custom session identifier
    custom_metadata=None,  # Optional: Dict of custom metadata
    defer_upload=False,    # Optional: Hold all data until start_audio_upload() is called
    **pipeline_task_kwargs # Optional: Additional args forwarded to PipelineTask
)

track_and_create_task()

Single-step simulation setup. Combines track_pipeline(), PipelineTask creation, and register_task_handlers(). Automatically configures OTel tracing when enabled.

task = tracer.track_and_create_task(
    pipeline,              # Required: Pipecat Pipeline instance
    context,               # Required: LLMContext instance
    runner_args=None,      # Optional: RunnerArguments for session_id
    transport=None,        # Optional: Transport for cleanup on disconnect
    session_id=None,       # Optional: Custom session identifier
    custom_metadata=None,  # Optional: Dict of custom metadata
    **pipeline_task_kwargs # Optional: Additional args forwarded to PipelineTask
)

register_task_handlers()

Registers cleanup handlers for automatic resource management when the session ends. Only needed when using the multi-step API (observe_pipeline() / track_pipeline()).

task = tracer.register_task_handlers(
    task,              # Required: PipelineTask instance
    transport=None     # Optional: Transport for cleanup on disconnect
)

start_audio_upload()

Grants consent and begins uploading session data when defer_upload=True. Flushes any buffered audio to S3 and enables normal upload for the rest of the session. No-op if defer_upload was not set.

await tracer.start_audio_upload()

abort()

Stops all capture mid-call and prevents data from being sent to the Cekura backend. Clears any deferred audio buffer and aborts in-progress S3 uploads.

await tracer.abort()

set_custom_metadata()

Update custom metadata at any point during the session.

tracer.set_custom_metadata({"key": "value"})

get_custom_metadata()

Retrieve the current custom metadata.

metadata = tracer.get_custom_metadata()

Environment variables:

CEKURA_TRACER_ENABLED="false": Disable the tracer entirely

Troubleshooting

Missing Aggregators Warning

If you see:

Cekura observability disabled: LLMUserAggregator and LLMAssistantAggregator not found in pipeline.

Your pipeline must include LLMUserAggregator and LLMAssistantAggregator for transcript capture. Add aggregators to your pipeline:

from pipecat.processors.aggregators.llm_response_universal import (
    LLMContextAggregatorPair,
    LLMUserAggregatorParams,
)
from pipecat.audio.vad.silero import SileroVADAnalyzer

user_agg, assistant_agg = LLMContextAggregatorPair(
    context,
    user_params=LLMUserAggregatorParams(
        vad_analyzer=SileroVADAnalyzer()
    )
)

pipeline = Pipeline([
    transport.input(),
    stt,
    user_agg,      # Add user aggregator
    llm,
    tts,
    transport.output(),
    assistant_agg,  # Add assistant aggregator
])
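If you want to catch this before a call starts, a pre-flight check over the list of processors you pass to Pipeline is one option. The sketch below matches on class names, which is a heuristic chosen for this example and not how the SDK performs its check. The two empty classes are stand-ins so the example runs without Pipecat installed.

```python
REQUIRED = ("LLMUserAggregator", "LLMAssistantAggregator")


def missing_aggregators(processors) -> list[str]:
    """Return the required aggregator class names absent from the processor list."""
    # Walk each processor's class hierarchy so subclasses also count.
    present = {cls.__name__ for p in processors for cls in type(p).__mro__}
    return [name for name in REQUIRED if name not in present]


# Stand-in classes for the demo; in a real agent these come from Pipecat.
class LLMUserAggregator:
    pass


class SomeOtherProcessor:
    pass


print(missing_aggregators([SomeOtherProcessor(), LLMUserAggregator()]))
# ['LLMAssistantAggregator']
```

Run the check on the same list you hand to Pipeline([...]) and log or raise if the result is non-empty.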

Import Error

If you see:

ImportError: PipecatTracer requires additional dependencies. Install with: pip install cekura[pipecat]

Install the pipecat dependencies:

pip install "cekura[pipecat]"
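To confirm that the install landed in the interpreter your agent actually runs under, you can check whether the top-level packages are importable. This sketch uses only the standard library. The helper name is hypothetical, and the module names `cekura` and `pipecat` are taken from the import statements on this page.

```python
import importlib.util


def missing_modules(names: list[str]) -> list[str]:
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# An empty list means both packages are visible to this interpreter.
print(missing_modules(["cekura", "pipecat"]))
```

If a name is reported as missing after a successful install, the package most likely went into a different virtual environment than the one running the agent.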

Next Steps

Create custom metrics to evaluate your agents
Set up automated testing for your Pipecat agents
Explore predefined metrics
