Quick Start
To enable metrics collection, configure OpenTelemetry with a metrics exporter. With the Prometheus exporter configured, metrics are served at http://localhost:9464/metrics.
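A minimal setup might look like the following sketch. It assumes the `opentelemetry-sdk`, `opentelemetry-exporter-prometheus`, and `prometheus-client` packages are installed; port 9464 matches the endpoint mentioned above.

```python
# Hedged sketch: configure an OpenTelemetry meter provider backed by a
# Prometheus exporter so collected metrics appear at
# http://localhost:9464/metrics.
from prometheus_client import start_http_server
from opentelemetry import metrics
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.exporter.prometheus import PrometheusMetricReader

# Serve the Prometheus scrape endpoint.
start_http_server(port=9464)

# Route all OpenTelemetry metrics through the Prometheus reader.
reader = PrometheusMetricReader()
metrics.set_meter_provider(MeterProvider(metric_readers=[reader]))

# Any Agent created after this point records metrics via this provider.
```

Run this once at startup, before constructing the agent, so the global meter provider is in place when the MetricsCollector initializes.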
MetricsCollector
The MetricsCollector class subscribes to events from all agent components and records OpenTelemetry metrics automatically. Each Agent creates a MetricsCollector internally, so metrics collection is enabled by default.
If no OpenTelemetry providers are configured, metrics are no-ops and have no performance impact.
The collector listens to events from:
- LLM — Response latency, token usage, tool calls
- STT — Transcription latency, audio duration
- TTS — Synthesis latency, audio duration, characters
- Turn Detection — Turn duration, trailing silence
- Realtime LLM — Session metrics, audio I/O, transcriptions
- VLM — Inference latency, token usage
- Video Processors — Frame processing, detections
Metric Attributes
All metrics include contextual attributes:

| Attribute | Description |
|---|---|
| provider | The plugin name (e.g., openai, deepgram) |
| model | Model identifier when available |
| error_type | Exception class name for error metrics |
| error_code | Error code when available |
Metrics Reference
All metrics use the vision_agents.core meter namespace.
STT Metrics
| Metric | Type | Unit | Description |
|---|---|---|---|
| stt.latency.ms | Histogram | ms | Processing latency for speech-to-text |
| stt.audio_duration.ms | Histogram | ms | Duration of audio processed |
| stt.errors | Counter | — | Total STT errors |
TTS Metrics
| Metric | Type | Unit | Description |
|---|---|---|---|
| tts.latency.ms | Histogram | ms | Synthesis latency |
| tts.audio_duration.ms | Histogram | ms | Duration of synthesized audio |
| tts.characters | Counter | — | Characters synthesized |
| tts.errors | Counter | — | Total TTS errors |
LLM Metrics
| Metric | Type | Unit | Description |
|---|---|---|---|
| llm.latency.ms | Histogram | ms | Response latency (request to complete) |
| llm.time_to_first_token.ms | Histogram | ms | Time to first token (streaming) |
| llm.tokens.input | Counter | — | Input/prompt tokens consumed |
| llm.tokens.output | Counter | — | Output/completion tokens generated |
| llm.tool_calls | Counter | — | Tool/function calls executed |
| llm.tool_latency.ms | Histogram | ms | Tool execution latency |
| llm.errors | Counter | — | Total LLM errors |
Turn Detection Metrics
| Metric | Type | Unit | Description |
|---|---|---|---|
| turn.duration.ms | Histogram | ms | Duration of detected speech turns |
| turn.trailing_silence.ms | Histogram | ms | Silence duration before turn end |
Realtime LLM Metrics
For speech-to-speech models like OpenAI Realtime:

| Metric | Type | Unit | Description |
|---|---|---|---|
| realtime.sessions | Counter | — | Sessions started |
| realtime.session_duration.ms | Histogram | ms | Session duration |
| realtime.audio.input.bytes | Counter | bytes | Audio bytes sent to LLM |
| realtime.audio.output.bytes | Counter | bytes | Audio bytes received from LLM |
| realtime.audio.input.duration.ms | Counter | ms | Audio duration sent |
| realtime.audio.output.duration.ms | Counter | ms | Audio duration received |
| realtime.responses | Counter | — | Complete responses received |
| realtime.transcriptions.user | Counter | — | User speech transcriptions |
| realtime.transcriptions.agent | Counter | — | Agent speech transcriptions |
| realtime.errors | Counter | — | Realtime errors |
VLM / Vision Metrics
| Metric | Type | Unit | Description |
|---|---|---|---|
| vlm.inference.latency.ms | Histogram | ms | VLM inference latency |
| vlm.inferences | Counter | — | Inference requests |
| vlm.tokens.input | Counter | — | Input tokens (text + image) |
| vlm.tokens.output | Counter | — | Output tokens |
| vlm.errors | Counter | — | VLM errors |
Video Processor Metrics
| Metric | Type | Unit | Description |
|---|---|---|---|
| video.frames.processed | Counter | — | Frames processed |
| video.processing.latency.ms | Histogram | ms | Frame processing latency |
| video.detections | Counter | — | Objects/items detected |
AgentMetrics
For in-process metrics without external infrastructure, access aggregated metrics directly from the agent.
Available AgentMetrics
| Metric | Type | Description |
|---|---|---|
| stt_latency_ms__avg | Average | Average STT processing latency |
| stt_audio_duration_ms__total | Counter | Total audio duration processed |
| tts_latency_ms__avg | Average | Average TTS synthesis latency |
| tts_audio_duration_ms__total | Counter | Total synthesized audio duration |
| tts_characters__total | Counter | Total characters synthesized |
| llm_latency_ms__avg | Average | Average LLM response latency |
| llm_time_to_first_token_ms__avg | Average | Average time to first token |
| llm_input_tokens__total | Counter | Total input tokens |
| llm_output_tokens__total | Counter | Total output tokens |
| llm_tool_calls__total | Counter | Total tool calls |
| llm_tool_latency_ms__avg | Average | Average tool execution latency |
| turn_duration_ms__avg | Average | Average turn duration |
| turn_trailing_silence_ms__avg | Average | Average trailing silence |
| realtime_audio_input_bytes__total | Counter | Total audio bytes sent |
| realtime_audio_output_bytes__total | Counter | Total audio bytes received |
| realtime_audio_input_duration_ms__total | Counter | Total input audio duration |
| realtime_audio_output_duration_ms__total | Counter | Total output audio duration |
| realtime_user_transcriptions__total | Counter | Total user transcriptions |
| realtime_agent_transcriptions__total | Counter | Total agent transcriptions |
| vlm_inference_latency_ms__avg | Average | Average VLM inference latency |
| vlm_inferences__total | Counter | Total VLM inferences |
| vlm_input_tokens__total | Counter | Total VLM input tokens |
| vlm_output_tokens__total | Counter | Total VLM output tokens |
| video_frames_processed__total | Counter | Total frames processed |
| video_processing_latency_ms__avg | Average | Average frame processing latency |
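As a sketch of how these aggregates could feed simple logging: the attribute names below follow the table above, but the exact shape of agent.metrics is an assumption here, so a stand-in object is used in place of a real Agent.

```python
from dataclasses import dataclass


# Stand-in for the aggregated metrics object; on a running Agent these
# values would be read from agent.metrics (attribute names taken from
# the table above).
@dataclass
class AgentMetricsSnapshot:
    llm_latency_ms__avg: float = 0.0
    llm_input_tokens__total: int = 0
    llm_output_tokens__total: int = 0
    tts_characters__total: int = 0


def format_metrics(m: AgentMetricsSnapshot) -> str:
    """Render a one-line summary suitable for periodic logging."""
    return (
        f"llm_latency_avg={m.llm_latency_ms__avg:.1f}ms "
        f"tokens_in={m.llm_input_tokens__total} "
        f"tokens_out={m.llm_output_tokens__total} "
        f"tts_chars={m.tts_characters__total}"
    )


snapshot = AgentMetricsSnapshot(
    llm_latency_ms__avg=812.5,
    llm_input_tokens__total=1200,
    llm_output_tokens__total=340,
    tts_characters__total=560,
)
print(format_metrics(snapshot))
```

In a real session loop you would replace the stand-in snapshot with the agent's own metrics object and log the summary on an interval.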
Prometheus Setup
Export metrics to Prometheus for monitoring dashboards and alerting. Install the Prometheus exporter, register it with your meter provider, and metrics are served at http://localhost:9464/metrics.
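A sketch of the setup, assuming the standard OpenTelemetry Python packages (the package names and the sample scrape config are assumptions, not taken from this page):

```python
# Hedged sketch. Install first:
#   pip install opentelemetry-sdk opentelemetry-exporter-prometheus prometheus-client
from prometheus_client import start_http_server
from opentelemetry import metrics
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.exporter.prometheus import PrometheusMetricReader

# Scrape endpoint: http://localhost:9464/metrics
start_http_server(port=9464)
metrics.set_meter_provider(MeterProvider(metric_readers=[PrometheusMetricReader()]))

# A matching Prometheus scrape config (prometheus.yml) would then be:
#   scrape_configs:
#     - job_name: vision-agents
#       static_configs:
#         - targets: ["localhost:9464"]
```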
Tracing with Jaeger
Trace requests across components to debug latency issues. Install the OTLP exporter and point it at a running Jaeger instance; traces can then be viewed in the Jaeger UI at http://localhost:16686.
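One way to wire this up, sketched under the assumption of a local Jaeger all-in-one container with OTLP ingest enabled (ports 4317 for OTLP/gRPC and 16686 for the UI are Jaeger defaults):

```python
# Hedged sketch: export traces to a local Jaeger instance over OTLP.
# Install first:
#   pip install opentelemetry-sdk opentelemetry-exporter-otlp
# Run Jaeger (the all-in-one image exposes OTLP on 4317 and the UI on 16686):
#   docker run --rm -p 16686:16686 -p 4317:4317 jaegertracing/all-in-one
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

provider = TracerProvider()
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="localhost:4317", insecure=True))
)
trace.set_tracer_provider(provider)

# Traces are then browsable in the Jaeger UI at http://localhost:16686.
```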
Complete Example
With metrics and tracing configured together, metrics are exposed at http://localhost:9464/metrics and traces appear in the Jaeger UI.
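A possible end-to-end sketch combining both exporters; the service name and environment values are illustrative, and the Agent construction is elided since its exact API isn't shown on this page:

```python
# Hedged sketch: metrics to Prometheus, traces to Jaeger, plus resource attributes.
from prometheus_client import start_http_server
from opentelemetry import metrics, trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.prometheus import PrometheusMetricReader
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# Shared resource attributes (values here are illustrative).
resource = Resource.create(
    {"service.name": "my-vision-agent", "deployment.environment": "production"}
)

# Metrics: Prometheus scrape endpoint at http://localhost:9464/metrics.
start_http_server(port=9464)
metrics.set_meter_provider(
    MeterProvider(resource=resource, metric_readers=[PrometheusMetricReader()])
)

# Tracing: OTLP export to a local Jaeger instance.
tracer_provider = TracerProvider(resource=resource)
tracer_provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="localhost:4317", insecure=True))
)
trace.set_tracer_provider(tracer_provider)

# ... then create and run your Agent as usual; its MetricsCollector picks up
# the configured providers automatically.
```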
Best Practices
- Configure OpenTelemetry — Set up providers to enable metric collection. If no providers are configured, metrics are no-ops with no performance impact.
- MetricsCollector is automatic — Each Agent creates a MetricsCollector internally, so you don't need to wire one up yourself.
- Use AgentMetrics for simple logging — Access agent.metrics directly for in-process metrics without external infrastructure.
- Add resource attributes — Include service name and environment in your metrics.
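Resource attributes can be attached when constructing the meter provider; a minimal sketch, with illustrative attribute values:

```python
# Hedged sketch: attach service name and environment to all emitted metrics.
from opentelemetry import metrics
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.resources import Resource

resource = Resource.create({
    "service.name": "my-vision-agent",       # illustrative name
    "deployment.environment": "production",  # e.g., staging / production
})
metrics.set_meter_provider(MeterProvider(resource=resource))
```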
Next Steps
- Production Deployment - Docker, Kubernetes, health checks
- Running Agents - Console mode and HTTP server for session management

