Introduction

Bolna provides detailed latency metrics for every Voice AI execution, helping you monitor and optimize agent response speed. These metrics break down timing across the entire voice pipeline, from speech recognition to LLM processing to audio synthesis. Access latency data via the Get Execution API in the latency_data object.

Latency Data Overview

Top-Level Metrics

```json
{
  "latency_data": {
    "stream_id": 129.56,
    "time_to_first_audio": 130.84,
    "region": "in",
    "transcriber": { ... },
    "llm": { ... },
    "synthesizer": { ... }
  }
}
```
| Field | Type | Description |
| --- | --- | --- |
| `stream_id` | float | Time (ms) to establish the audio stream connection |
| `time_to_first_audio` | float | Total time (ms) from call start to first audio played to the user |
| `region` | string | Geographic region code (e.g., `in` for India, `us` for the United States) |

`time_to_first_audio` is the most important metric for perceived responsiveness. It represents how long the caller waits before hearing the agent speak.
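As a minimal sketch of working with this response, the snippet below parses a `latency_data` payload (shaped like the example above) and pulls out the headline metrics; the payload here is hard-coded rather than fetched from the Get Execution API:

```python
import json

# Example latency_data payload, shaped like the response shown above.
payload = json.loads("""
{
  "latency_data": {
    "stream_id": 129.56,
    "time_to_first_audio": 130.84,
    "region": "in"
  }
}
""")

latency = payload["latency_data"]
ttfa = latency["time_to_first_audio"]  # headline responsiveness metric, in ms
print(f"Region {latency['region']}: first audio after {ttfa} ms")
```

In a real integration you would obtain `payload` from the Get Execution API response instead of a literal string; the parsing is the same.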

Pipeline Component Metrics

Transcriber

Converts spoken audio into text. Tracks how quickly speech is being transcribed.
```json
{
  "transcriber": {
    "time_to_connect": 226,
    "turns": [
      {
        "turn": 1,
        "turn_latency": [
          {
            "sequence_id": 1,
            "audio_to_text_latency": 20.12,
            "text": "hello who is there"
          },
          {
            "sequence_id": 2,
            "audio_to_text_latency": 19.96,
            "text": "hello who is this"
          }
        ]
      }
    ]
  }
}
```
| Field | Type | Description |
| --- | --- | --- |
| `time_to_connect` | integer | Time (ms) to establish the connection with the transcriber |
| `turn` | integer | Sequential conversation turn number (starting at 1) |
| `sequence_id` | integer | Incremental transcription update ID within a turn |
| `audio_to_text_latency` | float | Time (ms) from audio input to transcribed text |
| `text` | string | Transcribed text for this sequence |
Multiple sequences per turn represent incremental refinements. The transcriber provides partial results that improve over time. The final sequence is the most accurate.
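To illustrate, here is a small sketch that walks the `turns` array from the example above, takes the highest `sequence_id` in each turn as the final transcript, and averages the per-sequence latency (the `transcriber` dict is hard-coded from the sample response):

```python
# Sample transcriber data, copied from the example response above.
transcriber = {
    "time_to_connect": 226,
    "turns": [
        {
            "turn": 1,
            "turn_latency": [
                {"sequence_id": 1, "audio_to_text_latency": 20.12, "text": "hello who is there"},
                {"sequence_id": 2, "audio_to_text_latency": 19.96, "text": "hello who is this"},
            ],
        }
    ],
}

for turn in transcriber["turns"]:
    seqs = sorted(turn["turn_latency"], key=lambda s: s["sequence_id"])
    final = seqs[-1]  # highest sequence_id = most refined transcript
    avg = sum(s["audio_to_text_latency"] for s in seqs) / len(seqs)
    print(f"Turn {turn['turn']}: '{final['text']}' (avg {avg:.2f} ms)")
```

For turn 1 this reports the final transcript "hello who is this" with an average latency of 20.04 ms.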

Identifying Bottlenecks

Use the following symptoms to pinpoint performance issues across the pipeline:

High transcriber latency

Possible causes:
  • Network issues with the transcription service
  • Need for a different transcription provider
  • Poor audio quality or background noise
Fix: Try a different transcriber provider in your Audio Tab configuration, or improve audio input quality.

High LLM latency

Possible causes:
  • The LLM model is too large or complex
  • Prompts need optimization (too long or unstructured)
  • High load on the LLM service
Fix: Consider a faster LLM model, optimize your prompt length, or try a different provider in your LLM Tab.

High synthesizer latency

Possible causes:
  • Network issues with the TTS service
  • The voice model is computationally expensive
  • The provider is experiencing high load
Fix: Try a different voice or TTS provider in your Audio Tab configuration.
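A simple way to automate this triage is to compare each component's measured latency against a per-component budget. The thresholds below are hypothetical placeholders for illustration (Bolna does not prescribe specific values); tune them for your own deployment:

```python
# Hypothetical per-component latency budgets in ms -- illustrative only,
# not values prescribed by Bolna. Tune for your own deployment.
THRESHOLDS_MS = {"transcriber": 300, "llm": 800, "synthesizer": 400}

def find_bottlenecks(component_latencies_ms):
    """Return the names of components whose latency exceeds their budget."""
    return [
        name
        for name, ms in component_latencies_ms.items()
        if ms > THRESHOLDS_MS.get(name, float("inf"))
    ]

# Example: only the LLM exceeds its budget here.
print(find_bottlenecks({"transcriber": 226, "llm": 950, "synthesizer": 310}))
# prints ['llm']
```

You could feed this function the per-component timings extracted from `latency_data` and alert whenever the returned list is non-empty.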

Related Pages

  • Get Execution API: retrieve execution details with latency data
  • Call Status List: track the full lifecycle of your calls
  • Hangup Status Codes: understand call termination reasons