- Video analysis — Pose detection, object recognition, scene understanding
- Media transformation — Video effects, avatars, filters
- State injection — Feed detection results or external data to the LLM
Class Hierarchy
All processors inherit from the abstract Processor base class:
| Class | Purpose |
|---|---|
| Processor | Abstract base class with name, close(), and attach_agent() |
| VideoProcessor | Receives video tracks via process_video() |
| VideoPublisher | Outputs video via publish_video_track() |
| VideoProcessorPublisher | Receives and outputs video (e.g., annotated frames) |
| AudioProcessor | Receives audio via process_audio() |
| AudioPublisher | Outputs audio via publish_audio_track() |
| AudioProcessorPublisher | Receives and outputs audio |
Base Processor
All processors must implement name and close(). The attach_agent() method is optional.
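The contract above can be sketched in plain Python. This is an illustrative stand-in, not the SDK's actual class: only name, close(), and attach_agent() come from the text, and details such as close() being synchronous are assumptions.

```python
from abc import ABC, abstractmethod

# Illustrative stand-in for the SDK's Processor base class. Only the member
# names (name, close, attach_agent) are taken from the documentation above;
# signatures and sync/async behavior are assumptions.
class Processor(ABC):
    @property
    @abstractmethod
    def name(self) -> str:
        """Human-readable processor name."""

    @abstractmethod
    def close(self) -> None:
        """Release resources (models, tracks, background tasks)."""

    def attach_agent(self, agent) -> None:
        """Optional hook: the agent calls this when wiring up processors."""
        self.agent = agent

# A minimal concrete processor: implements the two required members.
class NoopProcessor(Processor):
    @property
    def name(self) -> str:
        return "noop"

    def close(self) -> None:
        pass
```

Subclasses that skip name or close() fail at instantiation time, which is the point of making both abstract.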
Video Processor
Receives video tracks from participants. The agent provides a shared VideoForwarder that distributes frames to all processors.
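A hedged sketch of that flow, with a fake track in place of a real video source. Treating the track handed to process_video() as an async iterator of frames is an assumption for illustration; the SDK's actual signature may differ.

```python
import asyncio

class FrameCounter:
    """Sketch of a VideoProcessor: process_video() receives a track, and
    frames arrive via the agent's shared VideoForwarder. Iterating the
    track asynchronously is an assumption, not the SDK's documented API."""
    name = "frame-counter"

    def __init__(self):
        self.frames_seen = 0

    async def process_video(self, track):
        async for frame in track:   # one iteration per forwarded frame
            self.frames_seen += 1

    def close(self):
        pass

# Fake async-iterable track standing in for a real participant track.
class FakeTrack:
    def __init__(self, n):
        self.n, self.i = n, 0

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self.i >= self.n:
            raise StopAsyncIteration
        self.i += 1
        return object()             # placeholder frame

counter = FrameCounter()
asyncio.run(counter.process_video(FakeTrack(3)))
print(counter.frames_seen)  # 3
```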
Video Publisher
Outputs a video track to the call (e.g., AI-generated video or avatars).
Video Processor + Publisher
For processors that receive video and output transformed frames (e.g., object detection with annotations).
Audio Processor
Receives audio data from participants. Audio is delivered as PcmData chunks.
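A minimal sketch of consuming those chunks. The PcmData class here is a stand-in with assumed fields (samples, sample_rate); the real chunk type ships with the SDK and may be shaped differently.

```python
from dataclasses import dataclass

@dataclass
class PcmData:
    """Stand-in for the SDK's PcmData chunk; these fields are assumptions."""
    samples: list            # 16-bit signed PCM samples
    sample_rate: int = 16000

class PeakMeter:
    """Sketch of an AudioProcessor: tracks the loudest sample seen so far."""
    name = "peak-meter"

    def __init__(self):
        self.peak = 0

    def process_audio(self, pcm: PcmData) -> None:
        # Called once per delivered chunk.
        for s in pcm.samples:
            self.peak = max(self.peak, abs(s))

    def close(self) -> None:
        pass

meter = PeakMeter()
meter.process_audio(PcmData(samples=[0, 120, -3200, 87]))
print(meter.peak)  # 3200
```

State accumulated this way (peak volume, detected speech, etc.) is exactly the kind of result a processor can later inject into the LLM's context.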
Audio Publisher
Outputs an audio track to the call.
Usage
Pass processors to the agent at initialization:

For complete examples including YOLO pose detection and object detection, see Building Video Processors.
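The initialization step can be sketched as follows. The Agent class below is a minimal stand-in: the real agent comes from the SDK and takes more arguments; only the idea of passing a list of processors, and the optional attach_agent() hook, come from the text above.

```python
class Agent:
    """Stand-in for the SDK's Agent; the `processors` keyword and the
    attach_agent() callback are taken from this page, everything else
    is illustrative."""
    def __init__(self, processors=()):
        self.processors = list(processors)
        for p in self.processors:
            # attach_agent() is optional on processors, so probe for it.
            if hasattr(p, "attach_agent"):
                p.attach_agent(self)

class EchoProcessor:
    """Trivial processor used only to show the wiring."""
    name = "echo"

    def attach_agent(self, agent):
        self.agent = agent

    def close(self):
        pass

agent = Agent(processors=[EchoProcessor()])
print(agent.processors[0].name)  # echo
```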

