Complete reference of all events available in Vision Agents. Events are emitted by components during agent execution and can be subscribed to with the `@agent.events.subscribe` decorator.

For usage patterns and examples, see the Event System Guide.

## Base Event Structure

All events inherit from `BaseEvent` and share these common fields:

| Field | Type | Description |
| --- | --- | --- |
| `type` | `str` | Event type identifier (e.g., `"plugin.stt_transcript"`) |
| `event_id` | `str` | Unique UUID for this event instance |
| `timestamp` | `datetime` | When the event was created (UTC) |
| `session_id` | `str \| None` | Current session identifier |
| `participant` | `Participant \| None` | Participant metadata from the call |

Plugin events extend `PluginBaseEvent`, which adds:

| Field | Type | Description |
| --- | --- | --- |
| `plugin_name` | `str \| None` | Name of the plugin that emitted the event |
| `plugin_version` | `str \| None` | Version of the plugin |

## Call Session Events

Events for participant activity on calls. These come from the Stream Video SDK.

Import: `from vision_agents.core.events import ...`

### CallSessionParticipantJoinedEvent

Emitted when a participant joins the call.

```python
from vision_agents.core.events import CallSessionParticipantJoinedEvent

@agent.events.subscribe
async def on_join(event: CallSessionParticipantJoinedEvent):
    user = event.participant.user
    print(f"{user.name} joined (id: {user.id})")
```

| Field | Type | Description |
| --- | --- | --- |
| `call_cid` | `str` | Call channel ID |
| `session_id` | `str` | Session identifier |
| `participant` | `CallParticipantResponse` | Joined participant info |

### CallSessionParticipantLeftEvent

Emitted when a participant leaves the call.

```python
from vision_agents.core.events import CallSessionParticipantLeftEvent

@agent.events.subscribe
async def on_leave(event: CallSessionParticipantLeftEvent):
    print(f"{event.participant.user.name} left")
    print(f"Duration: {event.duration_seconds}s")
```

| Field | Type | Description |
| --- | --- | --- |
| `call_cid` | `str` | Call channel ID |
| `session_id` | `str` | Session identifier |
| `participant` | `CallParticipantResponse` | Left participant info |
| `duration_seconds` | `int` | How long the participant was in the call |
| `reason` | `str \| None` | Why they left |

### Other Call Events

| Event | Description |
| --- | --- |
| `CallCreatedEvent` | Call was created |
| `CallEndedEvent` | Call ended |
| `CallSessionStartedEvent` | Session started |
| `CallSessionEndedEvent` | Session ended |
| `CallSessionParticipantCountsUpdatedEvent` | Participant count changed |
| `CallUpdatedEvent` | Call settings updated |
| `CallMemberAddedEvent` | Member added to call |
| `CallMemberRemovedEvent` | Member removed from call |
| `CallRecordingStartedEvent` | Recording started |
| `CallRecordingStoppedEvent` | Recording stopped |
| `CallTranscriptionStartedEvent` | Transcription started |
| `CallTranscriptionStoppedEvent` | Transcription stopped |
| `ClosedCaptionEvent` | Closed caption received |

## Speech-to-Text (STT) Events

Events from speech recognition.

Import: `from vision_agents.core.stt.events import ...`

### STTTranscriptEvent

Emitted when a complete transcript is available.

```python
from vision_agents.core.stt.events import STTTranscriptEvent

@agent.events.subscribe
async def on_transcript(event: STTTranscriptEvent):
    print(f"Text: {event.text}")
    print(f"Confidence: {event.confidence}")
    print(f"Language: {event.language}")
```

| Field | Type | Description |
| --- | --- | --- |
| `text` | `str` | Transcribed text (required, non-empty) |
| `confidence` | `float \| None` | Recognition confidence (0.0-1.0) |
| `language` | `str \| None` | Detected language code |
| `processing_time_ms` | `float \| None` | Time to process the audio |
| `audio_duration_ms` | `float \| None` | Duration of audio processed |
| `model_name` | `str \| None` | Model used for recognition |

### STTPartialTranscriptEvent

Emitted during speech for interim results.

| Field | Type | Description |
| --- | --- | --- |
| `text` | `str` | Partial transcribed text |
| `confidence` | `float \| None` | Recognition confidence |
| `language` | `str \| None` | Detected language |
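A common pattern is to display the latest partial transcript as a live caption and replace it once the final `STTTranscriptEvent` arrives. The sketch below illustrates that buffering logic with minimal dataclasses standing in for the framework's event types (`PartialTranscript` and `FinalTranscript` are hypothetical stand-ins, not Vision Agents classes):

```python
from dataclasses import dataclass

# Hypothetical stand-ins for STTPartialTranscriptEvent / STTTranscriptEvent.
@dataclass
class PartialTranscript:
    text: str

@dataclass
class FinalTranscript:
    text: str

class CaptionBuffer:
    """Shows the latest partial text until a final transcript replaces it."""

    def __init__(self):
        self.committed = []  # finalized utterances
        self.pending = ""    # latest interim text

    def on_event(self, event):
        if isinstance(event, FinalTranscript):
            # A final transcript supersedes any pending partials.
            self.committed.append(event.text)
            self.pending = ""
        else:
            # Each partial overwrites the previous one.
            self.pending = event.text

    def render(self) -> str:
        parts = self.committed + ([self.pending] if self.pending else [])
        return " ".join(parts)

buf = CaptionBuffer()
buf.on_event(PartialTranscript(text="hello wor"))
buf.on_event(PartialTranscript(text="hello world"))
buf.on_event(FinalTranscript(text="Hello, world."))
print(buf.render())  # Hello, world.
```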

### STTErrorEvent

Emitted when STT encounters an error.

```python
from vision_agents.core.stt.events import STTErrorEvent

@agent.events.subscribe
async def on_stt_error(event: STTErrorEvent):
    print(f"Error: {event.error_message}")
    print(f"Recoverable: {event.is_recoverable}")
```

| Field | Type | Description |
| --- | --- | --- |
| `error` | `Exception \| None` | The exception that occurred |
| `error_code` | `str \| None` | Error code identifier |
| `context` | `str \| None` | Additional context |
| `retry_count` | `int` | Number of retry attempts |
| `is_recoverable` | `bool` | Whether the error is recoverable |
| `error_message` | `str` | Property: human-readable error message |

### STTConnectionEvent

Emitted when the STT connection state changes.

| Field | Type | Description |
| --- | --- | --- |
| `connection_state` | `ConnectionState` | New state (see the ConnectionState enum) |
| `provider` | `str \| None` | STT provider name |
| `details` | `dict \| None` | Additional connection details |
| `reconnect_attempts` | `int` | Number of reconnection attempts |

## Text-to-Speech (TTS) Events

Events from speech synthesis.

Import: `from vision_agents.core.tts.events import ...`

### TTSAudioEvent

Emitted when TTS audio data is available.

```python
from vision_agents.core.tts.events import TTSAudioEvent

@agent.events.subscribe
async def on_audio(event: TTSAudioEvent):
    print(f"Chunk {event.chunk_index}, final: {event.is_final_chunk}")
```

| Field | Type | Description |
| --- | --- | --- |
| `data` | `PcmData \| None` | Audio data |
| `chunk_index` | `int` | Index of this chunk |
| `is_final_chunk` | `bool` | Whether this is the last chunk |
| `text_source` | `str \| None` | Original text being synthesized |
| `synthesis_id` | `str \| None` | Unique ID for this synthesis |
| `epoch` | `int` | Interruption epoch counter. Increments each time `tts.interrupt()` is called. Compare against `tts.epoch` to detect stale audio events emitted before an interruption. |
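The epoch comparison described above can be sketched as follows. This is an illustrative mock, not the framework's implementation: `AudioChunk` and `Player` are hypothetical stand-ins for `TTSAudioEvent` and the playback side, and `Player.interrupt()` mirrors the effect attributed to `tts.interrupt()`:

```python
from dataclasses import dataclass

# Hypothetical stand-in for TTSAudioEvent; only the fields used here.
@dataclass
class AudioChunk:
    epoch: int
    chunk_index: int

class Player:
    """Plays audio chunks, dropping any emitted before the last interruption."""

    def __init__(self):
        self.current_epoch = 0
        self.played = []

    def interrupt(self):
        # Mirrors tts.interrupt(): bump the epoch so in-flight chunks go stale.
        self.current_epoch += 1

    def on_audio(self, event: AudioChunk):
        if event.epoch != self.current_epoch:
            return  # stale chunk from before the interruption; drop it
        self.played.append(event.chunk_index)

p = Player()
p.on_audio(AudioChunk(epoch=0, chunk_index=0))  # played
p.interrupt()
p.on_audio(AudioChunk(epoch=0, chunk_index=1))  # stale, dropped
p.on_audio(AudioChunk(epoch=1, chunk_index=0))  # new synthesis, played
print(p.played)  # [0, 0]
```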

### TTSSynthesisStartEvent

Emitted when TTS synthesis begins.

| Field | Type | Description |
| --- | --- | --- |
| `text` | `str \| None` | Text being synthesized |
| `synthesis_id` | `str` | Unique ID for this synthesis |
| `model_name` | `str \| None` | TTS model name |
| `voice_id` | `str \| None` | Voice identifier |
| `estimated_duration_ms` | `float \| None` | Estimated audio duration |

### TTSSynthesisCompleteEvent

Emitted when TTS synthesis finishes.

```python
from vision_agents.core.tts.events import TTSSynthesisCompleteEvent

@agent.events.subscribe
async def on_complete(event: TTSSynthesisCompleteEvent):
    print(f"Synthesis took {event.synthesis_time_ms}ms")
    print(f"Audio duration: {event.audio_duration_ms}ms")
```

| Field | Type | Description |
| --- | --- | --- |
| `synthesis_id` | `str \| None` | Unique ID for this synthesis |
| `text` | `str \| None` | Text that was synthesized |
| `total_audio_bytes` | `int` | Total bytes of audio |
| `synthesis_time_ms` | `float` | Processing time |
| `audio_duration_ms` | `float \| None` | Resulting audio duration |
| `chunk_count` | `int` | Number of chunks produced |
| `real_time_factor` | `float \| None` | Synthesis speed vs. real-time |

### TTSErrorEvent

Emitted when TTS encounters an error.

| Field | Type | Description |
| --- | --- | --- |
| `error` | `Exception \| None` | The exception that occurred |
| `error_code` | `str \| None` | Error code identifier |
| `context` | `str \| None` | Additional context |
| `text_source` | `str \| None` | Text being synthesized |
| `synthesis_id` | `str \| None` | Synthesis identifier |
| `is_recoverable` | `bool` | Whether the error is recoverable |

### TTSConnectionEvent

Emitted when the TTS connection state changes.

| Field | Type | Description |
| --- | --- | --- |
| `connection_state` | `ConnectionState` | New connection state |
| `provider` | `str \| None` | TTS provider name |
| `details` | `dict \| None` | Additional details |

## LLM Events

Events from language model interactions.

Import: `from vision_agents.core.llm.events import ...`

### LLMResponseCompletedEvent

Emitted when the LLM finishes a response.

```python
from vision_agents.core.llm.events import LLMResponseCompletedEvent

@agent.events.subscribe
async def on_response(event: LLMResponseCompletedEvent):
    print(f"Response: {event.text}")
    print(f"Model: {event.model}")
    print(f"Tokens: {event.input_tokens} in, {event.output_tokens} out")
    print(f"Latency: {event.latency_ms}ms")
```

| Field | Type | Description |
| --- | --- | --- |
| `text` | `str` | Complete response text |
| `original` | `Any` | Raw response from provider |
| `item_id` | `str \| None` | Response item identifier |
| `latency_ms` | `float \| None` | Total request-to-response time |
| `time_to_first_token_ms` | `float \| None` | Time to first token (streaming) |
| `input_tokens` | `int \| None` | Input/prompt tokens used |
| `output_tokens` | `int \| None` | Output tokens generated |
| `total_tokens` | `int \| None` | Total tokens used |
| `model` | `str \| None` | Model identifier |

### LLMResponseChunkEvent

Emitted for each chunk during streaming responses.

| Field | Type | Description |
| --- | --- | --- |
| `delta` | `str \| None` | Text delta for this chunk |
| `content_index` | `int \| None` | Index of content part |
| `item_id` | `str \| None` | Response item identifier |
| `output_index` | `int \| None` | Output index |
| `sequence_number` | `int \| None` | Sequence number |
| `is_first_chunk` | `bool` | Whether this is the first chunk |
| `time_to_first_token_ms` | `float \| None` | Time to the first chunk |
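A typical use of chunk events is assembling the full response text from the deltas. The sketch below uses plain dicts as stand-ins for `LLMResponseChunkEvent` instances (only `delta` matters for this pattern); note that `delta` is optional, so `None` values are skipped:

```python
# Hypothetical chunk stream; dicts stand in for LLMResponseChunkEvent fields.
chunks = [
    {"delta": "Hel", "is_first_chunk": True},
    {"delta": "lo, ", "is_first_chunk": False},
    {"delta": None, "is_first_chunk": False},   # delta may be None; skip it
    {"delta": "world!", "is_first_chunk": False},
]

# Accumulate deltas in arrival order to rebuild the complete response text.
parts = []
for chunk in chunks:
    if chunk["delta"] is not None:
        parts.append(chunk["delta"])

full_text = "".join(parts)
print(full_text)  # Hello, world!
```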

### LLMRequestStartedEvent

Emitted when an LLM request begins.

| Field | Type | Description |
| --- | --- | --- |
| `request_id` | `str` | Unique request identifier |
| `model` | `str \| None` | Model being used |
| `streaming` | `bool` | Whether streaming is enabled |

### LLMErrorEvent

Emitted when a non-realtime LLM error occurs.

| Field | Type | Description |
| --- | --- | --- |
| `error` | `Exception \| None` | The exception |
| `error_code` | `str \| None` | Error code |
| `context` | `str \| None` | Additional context |
| `request_id` | `str \| None` | Request identifier |
| `is_recoverable` | `bool` | Whether the error is recoverable |

## Realtime LLM Events

Events specific to realtime LLM connections (like the OpenAI Realtime API).

Import: `from vision_agents.core.llm.events import ...`

### RealtimeConnectedEvent

Emitted when the realtime connection is established.

| Field | Type | Description |
| --- | --- | --- |
| `provider` | `str \| None` | Provider name |
| `session_id` | `str \| None` | Session identifier |
| `session_config` | `dict \| None` | Session configuration |
| `capabilities` | `list[str] \| None` | Available capabilities |

### RealtimeDisconnectedEvent

Emitted when the realtime connection closes.

| Field | Type | Description |
| --- | --- | --- |
| `provider` | `str \| None` | Provider name |
| `session_id` | `str \| None` | Session identifier |
| `reason` | `str \| None` | Disconnection reason |
| `was_clean` | `bool` | Whether the disconnect was clean |

### RealtimeUserSpeechTranscriptionEvent

Emitted when user speech is transcribed by the realtime API.

```python
from vision_agents.core.llm.events import RealtimeUserSpeechTranscriptionEvent

@agent.events.subscribe
async def on_user_speech(event: RealtimeUserSpeechTranscriptionEvent):
    print(f"User said: {event.text}")
```

| Field | Type | Description |
| --- | --- | --- |
| `text` | `str` | Transcribed user speech |
| `original` | `Any` | Raw event from provider |

### RealtimeAgentSpeechTranscriptionEvent

Emitted when agent speech is transcribed by the realtime API.

```python
from vision_agents.core.llm.events import RealtimeAgentSpeechTranscriptionEvent

@agent.events.subscribe
async def on_agent_speech(event: RealtimeAgentSpeechTranscriptionEvent):
    print(f"Agent said: {event.text}")
```

| Field | Type | Description |
| --- | --- | --- |
| `text` | `str` | Transcribed agent speech |
| `original` | `Any` | Raw event from provider |

### RealtimeAudioInputEvent

Emitted when audio is sent to the realtime session.

| Field | Type | Description |
| --- | --- | --- |
| `data` | `PcmData \| None` | Audio data sent |

### RealtimeAudioOutputEvent

Emitted when audio is received from the realtime session.

| Field | Type | Description |
| --- | --- | --- |
| `data` | `PcmData \| None` | Audio data received |
| `response_id` | `str \| None` | Response identifier |
| `epoch` | `int` | Interruption epoch counter. Increments on interruption so stale audio output events from a previous response can be identified and dropped. |

### RealtimeResponseEvent

Emitted when the realtime session provides a response.

| Field | Type | Description |
| --- | --- | --- |
| `text` | `str \| None` | Response text |
| `original` | `str \| None` | Raw response |
| `response_id` | `str` | Response identifier |
| `is_complete` | `bool` | Whether the response is complete |
| `conversation_item_id` | `str \| None` | Conversation item ID |

### RealtimeConversationItemEvent

Emitted for conversation item updates.

| Field | Type | Description |
| --- | --- | --- |
| `item_id` | `str \| None` | Item identifier |
| `item_type` | `str \| None` | Type: `"message"`, `"function_call"`, `"function_call_output"` |
| `status` | `str \| None` | Status: `"completed"`, `"in_progress"`, `"incomplete"` |
| `role` | `str \| None` | Role: `"user"`, `"assistant"`, `"system"` |
| `content` | `list[dict] \| None` | Item content |

### RealtimeErrorEvent

Emitted when a realtime error occurs.

| Field | Type | Description |
| --- | --- | --- |
| `error` | `Exception \| None` | The exception |
| `error_code` | `str \| None` | Error code |
| `context` | `str \| None` | Additional context |
| `is_recoverable` | `bool` | Whether the error is recoverable |

## Tool Events

Events for function calling / tool use.

Import: `from vision_agents.core.llm.events import ...`

### ToolStartEvent

Emitted when tool execution begins.

```python
from vision_agents.core.llm.events import ToolStartEvent

@agent.events.subscribe
async def on_tool_start(event: ToolStartEvent):
    print(f"Calling {event.tool_name}")
    print(f"Args: {event.arguments}")
```

| Field | Type | Description |
| --- | --- | --- |
| `tool_name` | `str` | Name of the tool being called |
| `arguments` | `dict \| None` | Arguments passed to the tool |
| `tool_call_id` | `str \| None` | Unique call identifier |

### ToolEndEvent

Emitted when tool execution completes.

```python
from vision_agents.core.llm.events import ToolEndEvent

@agent.events.subscribe
async def on_tool_end(event: ToolEndEvent):
    if event.success:
        print(f"{event.tool_name} returned: {event.result}")
        print(f"Took {event.execution_time_ms}ms")
    else:
        print(f"{event.tool_name} failed: {event.error}")
```

| Field | Type | Description |
| --- | --- | --- |
| `tool_name` | `str` | Name of the tool |
| `success` | `bool` | Whether execution succeeded |
| `result` | `Any` | Return value (if success) |
| `error` | `str \| None` | Error message (if failed) |
| `tool_call_id` | `str \| None` | Unique call identifier |
| `execution_time_ms` | `float \| None` | Execution duration |

## VLM Events

Events for vision/multimodal language models.

Import: `from vision_agents.core.llm.events import ...`

### VLMInferenceStartEvent

Emitted when a VLM (Vision Language Model) inference starts.

Event Type: `plugin.vlm_inference_start`

| Field | Type | Description |
| --- | --- | --- |
| `inference_id` | `str` | Unique identifier for this inference |
| `model` | `str \| None` | Model identifier |
| `frames_count` | `int` | Number of frames to process |

### VLMInferenceCompletedEvent

Emitted when a VLM inference completes. Contains timing metrics, token usage, and detection counts.

Event Type: `plugin.vlm_inference_completed`

```python
from vision_agents.core.llm.events import VLMInferenceCompletedEvent

@agent.events.subscribe
async def on_vlm_complete(event: VLMInferenceCompletedEvent):
    print(f"VLM response: {event.text}")
    print(f"Processed {event.frames_processed} frames")
    print(f"Detected {event.detections} objects")
```

| Field | Type | Description |
| --- | --- | --- |
| `inference_id` | `str \| None` | Unique identifier for this inference |
| `model` | `str \| None` | Model identifier |
| `text` | `str` | Generated text response |
| `latency_ms` | `float \| None` | Total time from request to complete response |
| `input_tokens` | `int \| None` | Number of input tokens (text + image tokens) |
| `output_tokens` | `int \| None` | Number of output tokens generated |
| `frames_processed` | `int` | Number of video frames processed |
| `detections` | `int` | Number of objects/items detected |

This event is used by `MetricsCollector` to record VLM metrics. See Telemetry for details.
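Beyond the built-in metrics, the timing and token fields on this event can be combined into simple throughput figures in a handler. The arithmetic below uses made-up sample values in place of real event fields:

```python
# Sample values standing in for VLMInferenceCompletedEvent fields.
latency_ms = 850.0        # event.latency_ms
output_tokens = 120       # event.output_tokens
frames_processed = 4      # event.frames_processed

# Generation throughput: tokens produced per second of wall-clock latency.
tokens_per_second = output_tokens / (latency_ms / 1000.0)

# Average processing cost per video frame.
ms_per_frame = latency_ms / frames_processed

print(f"{tokens_per_second:.1f} tok/s, {ms_per_frame:.1f} ms/frame")
# 141.2 tok/s, 212.5 ms/frame
```

In a real handler, guard against `latency_ms`, `output_tokens`, or `frames_processed` being `None` or zero before dividing, since all of these fields are optional or may be absent.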

### VLMErrorEvent

Emitted when a VLM error occurs.

Event Type: `plugin.vlm_error`

| Field | Type | Description |
| --- | --- | --- |
| `error` | `Exception \| None` | The exception that occurred |
| `error_code` | `str \| None` | Error code if available |
| `context` | `str \| None` | Additional context about the error |
| `inference_id` | `str \| None` | ID of the failed inference |
| `is_recoverable` | `bool` | Whether the error is recoverable |

## Video Processor Events

Events from video processing plugins (roboflow, ultralytics, etc.).

Import: `from vision_agents.core.events import VideoProcessorDetectionEvent`

### VideoProcessorDetectionEvent

Emitted when a video processor detects objects in a frame.

```python
from vision_agents.core.events import VideoProcessorDetectionEvent

@agent.events.subscribe
async def on_detection(event: VideoProcessorDetectionEvent):
    print(f"Detected {event.detection_count} objects")
    print(f"Inference took {event.inference_time_ms}ms")
```

| Field | Type | Description |
| --- | --- | --- |
| `model_id` | `str \| None` | Identifier of the model used |
| `inference_time_ms` | `float \| None` | Time taken for inference |
| `detection_count` | `int` | Number of objects detected |

This event is used by `MetricsCollector` to record video processing metrics. See Telemetry for details.

## OpenAI Plugin Events

Events specific to the OpenAI plugin.

Import: `from vision_agents.plugins.openai.events import ...`

### OpenAIStreamEvent

Emitted when OpenAI provides a streaming chunk.

| Field | Type | Description |
| --- | --- | --- |
| `chunk` | `Any` | Raw streaming chunk from OpenAI |

## VAD Events

Voice Activity Detection events.

Import: `from vision_agents.core.vad.events import ...`

### VADSpeechStartEvent

Emitted when VAD detects the start of speech.

| Field | Type | Description |
| --- | --- | --- |
| `timestamp` | `datetime` | When speech started |

### VADSpeechEndEvent

Emitted when VAD detects the end of speech.

| Field | Type | Description |
| --- | --- | --- |
| `timestamp` | `datetime` | When speech ended |
| `duration_ms` | `float \| None` | Duration of the speech segment |
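Since `duration_ms` is optional, one fallback is to derive the segment length from the pair of start/end `timestamp` values yourself. The timestamps below are fabricated examples, not real event data:

```python
from datetime import datetime, timezone

# Fabricated timestamps standing in for VADSpeechStartEvent.timestamp and
# VADSpeechEndEvent.timestamp (both are UTC datetimes).
speech_start = datetime(2024, 1, 1, 12, 0, 0, 0, tzinfo=timezone.utc)
speech_end = datetime(2024, 1, 1, 12, 0, 1, 250000, tzinfo=timezone.utc)

# Subtracting two datetimes yields a timedelta; convert to milliseconds.
duration_ms = (speech_end - speech_start).total_seconds() * 1000
print(duration_ms)  # 1250.0
```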

### VADErrorEvent

Emitted when VAD encounters an error.

| Field | Type | Description |
| --- | --- | --- |
| `error` | `Exception \| None` | The exception that occurred |
| `error_code` | `str \| None` | Error code if available |
| `context` | `str \| None` | Additional context |

## Turn Detection Events

Events for detecting when speakers start and stop talking.

Import: `from vision_agents.core.turn_detection.events import ...`

### TurnStartedEvent

Emitted when a speaker starts their turn.

Event Type: `plugin.turn_detection.turn_started`

```python
from vision_agents.core.turn_detection.events import TurnStartedEvent

@agent.events.subscribe
async def on_turn_start(event: TurnStartedEvent):
    print(f"Turn started (confidence: {event.confidence})")
```

| Field | Type | Description |
| --- | --- | --- |
| `participant` | `Participant \| None` | Who started speaking |
| `participant_id` | `str \| None` | ID of the participant speaking |
| `confidence` | `float \| None` | Detection confidence (0.0-1.0) |
| `custom` | `dict \| None` | Additional metadata |

### TurnEndedEvent

Emitted when a speaker completes their turn.

Event Type: `plugin.turn_detection.turn_ended`

```python
from vision_agents.core.turn_detection.events import TurnEndedEvent

@agent.events.subscribe
async def on_turn_end(event: TurnEndedEvent):
    print(f"Turn ended after {event.duration_ms}ms")
    print(f"Silence: {event.trailing_silence_ms}ms")
```

| Field | Type | Description |
| --- | --- | --- |
| `participant` | `Participant \| None` | Who stopped speaking |
| `participant_id` | `str \| None` | ID of the participant |
| `confidence` | `float \| None` | Detection confidence |
| `duration_ms` | `float \| None` | Duration of the turn in milliseconds |
| `trailing_silence_ms` | `float \| None` | Silence duration before turn end |
| `custom` | `dict \| None` | Additional metadata |
| `eager_end_of_turn` | `bool` | Early end detection flag |

This event is used by `MetricsCollector` to record turn detection metrics. See Telemetry for details.

## xAI Plugin Events

Events specific to the xAI plugin.

Import: `from vision_agents.plugins.xai.events import ...`

### XAIChunkEvent

Emitted for xAI streaming response chunks.

| Field | Type | Description |
| --- | --- | --- |
| `chunk` | `Any` | Raw streaming chunk from xAI |

## Qwen Plugin Events

Events specific to the Qwen plugin.

Import: `from vision_agents.plugins.qwen.events import ...`

### QwenLLMErrorEvent

Emitted when the Qwen LLM encounters an error.

| Field | Type | Description |
| --- | --- | --- |
| `error` | `Exception \| None` | The exception that occurred |
| `error_code` | `str \| None` | Error code if available |
| `context` | `str \| None` | Additional context |

## ConnectionState Enum

Used in connection events to indicate state.

Import: `from vision_agents.core.events import ConnectionState`

| Value | Description |
| --- | --- |
| `DISCONNECTED` | Not connected |
| `CONNECTING` | Connection in progress |
| `CONNECTED` | Successfully connected |
| `RECONNECTING` | Attempting to reconnect |
| `ERROR` | Connection error |
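A connection handler typically branches on these values, e.g. to log reconnect attempts or alert on errors. The sketch below uses a local `Enum` mirroring the documented values so it runs standalone; in real code you would import `ConnectionState` from `vision_agents.core.events` instead, and `describe` is a hypothetical helper:

```python
from enum import Enum, auto

# Local mirror of the documented ConnectionState values, for illustration only.
class ConnectionState(Enum):
    DISCONNECTED = auto()
    CONNECTING = auto()
    CONNECTED = auto()
    RECONNECTING = auto()
    ERROR = auto()

def describe(state: ConnectionState, reconnect_attempts: int = 0) -> str:
    """Turn a connection-state change into a log-friendly message."""
    if state is ConnectionState.RECONNECTING:
        return f"reconnecting (attempt {reconnect_attempts})"
    if state is ConnectionState.ERROR:
        return "connection error; consider alerting"
    return state.name.lower()

print(describe(ConnectionState.RECONNECTING, reconnect_attempts=2))
# reconnecting (attempt 2)
```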

## Subscribing to Events

All events can be subscribed to using the `@agent.events.subscribe` decorator:

```python
@agent.events.subscribe
async def my_handler(event: EventType):
    # Handle event
    pass
```

Subscribe to multiple event types using union types:

```python
@agent.events.subscribe
async def my_handler(event: STTTranscriptEvent | STTPartialTranscriptEvent):
    print(f"Transcript: {event.text}")
```

Event handlers must be `async` functions; non-async handlers raise a `RuntimeError`.