Anam Avatars

Anam provides real-time interactive avatar video with automatic lip-sync. Add a video avatar to your agent that speaks with natural movements synchronized to your agent’s voice output.

Vision Agents requires a Stream account for real-time transport. Anam provides API keys and avatar IDs through its dashboard.

Installation

uv add "vision-agents[anam]"

Quick Start

from vision_agents.core import Agent, User
from vision_agents.plugins import gemini, deepgram, getstream
from vision_agents.plugins.anam import AnamAvatarPublisher

agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="Assistant", id="agent"),
    instructions="You're a friendly AI assistant.",
    llm=gemini.LLM("gemini-3-flash-preview"),
    tts=deepgram.TTS(),
    stt=deepgram.STT(),
    processors=[AnamAvatarPublisher()],
)

Set ANAM_API_KEY and ANAM_AVATAR_ID in your environment, or pass them directly to AnamAvatarPublisher.
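
As a minimal sketch of the two configuration paths, the helper below prefers explicitly passed values and falls back to the environment. The helper name `resolve_anam_settings` is hypothetical and not part of the plugin; only the `api_key` and `avatar_id` keyword names and the two environment variable names come from this page.

```python
import os


def resolve_anam_settings(api_key=None, avatar_id=None, env=None):
    """Build keyword arguments for AnamAvatarPublisher.

    Hypothetical helper for illustration only: explicit values win,
    otherwise ANAM_API_KEY / ANAM_AVATAR_ID are read from the environment.
    """
    env = os.environ if env is None else env
    settings = {
        "api_key": api_key or env.get("ANAM_API_KEY"),
        "avatar_id": avatar_id or env.get("ANAM_AVATAR_ID"),
    }
    # Fail early with a clear message instead of at connection time.
    missing = [name for name, value in settings.items() if not value]
    if missing:
        raise ValueError(f"Missing Anam settings: {', '.join(missing)}")
    return settings


# Usage (requires the plugin to be installed):
# from vision_agents.plugins.anam import AnamAvatarPublisher
# publisher = AnamAvatarPublisher(**resolve_anam_settings())
```

Checking both values up front makes a missing key show up as a configuration error rather than a failed connection.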

Parameters

Name	Type	Default	Description
`avatar_id`	`str`	`None`	Anam avatar ID (defaults to `ANAM_AVATAR_ID` env var)
`api_key`	`str`	`None`	API key (defaults to `ANAM_API_KEY` env var)
`client_options`	`ClientOptions`	`None`	Advanced Anam client configuration
`connect_timeout`	`float`	`None`	Seconds to wait for connection (`None` = wait indefinitely)
`session_ready_timeout`	`float`	`None`	Seconds to wait for session ready (`None` = wait indefinitely)
`width`	`int`	`1920`	Output video width in pixels (must be positive and even)
`height`	`int`	`1080`	Output video height in pixels (must be positive and even)
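
The size constraint in the table (positive and even) can be checked before the publisher is constructed. This is a sketch under that stated constraint; `check_video_size` is a hypothetical helper, and how the plugin itself reports invalid sizes is not described here.

```python
def check_video_size(width=1920, height=1080):
    """Return (width, height) if both are positive even integers, else raise."""
    for name, value in (("width", width), ("height", height)):
        # bool is a subclass of int, so reject it explicitly.
        if not isinstance(value, int) or isinstance(value, bool):
            raise ValueError(f"{name} must be an integer, got {value!r}")
        if value <= 0 or value % 2 != 0:
            raise ValueError(f"{name} must be positive and even, got {value}")
    return width, height


# Usage (requires the plugin; the timeout values are illustrative only):
# width, height = check_video_size(1280, 720)
# publisher = AnamAvatarPublisher(
#     width=width,
#     height=height,
#     connect_timeout=10.0,
#     session_ready_timeout=30.0,
# )
```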

How It Works

1. Agent TTS audio is resampled to 24 kHz mono and streamed to Anam.
2. Anam generates lip-synced avatar video and audio from the input.
3. Avatar video and audio frames are streamed back to call participants via Stream Edge.
4. When a user starts speaking, the avatar is automatically interrupted.
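
To make "resampled to 24 kHz mono" concrete, here is a pure-Python sketch of a channel downmix followed by linear-interpolation resampling. It is an illustration only: the plugin's actual resampler is not described on this page, both function names are hypothetical, and a production resampler would also apply an anti-aliasing filter.

```python
def to_mono(samples, channels):
    """Average interleaved channels into a single mono stream."""
    if channels == 1:
        return list(samples)
    return [
        sum(samples[i:i + channels]) / channels
        for i in range(0, len(samples) - channels + 1, channels)
    ]


def resample_linear(samples, src_rate, dst_rate=24000):
    """Resample mono samples by linear interpolation (no anti-alias filter)."""
    if not samples or src_rate == dst_rate:
        return list(samples)
    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate      # position in the source signal
        left = min(int(pos), last)         # nearest source sample at or before pos
        right = min(left + 1, last)        # next sample, clamped at the end
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out


# 48 kHz stereo frames -> 24 kHz mono
stereo_48k = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0]
mono_24k = resample_linear(to_mono(stereo_48k, channels=2), src_rate=48000)
```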

With Realtime LLMs

Anam also works with realtime speech-to-speech models. It subscribes to both TTS audio events and realtime audio output, so you can swap in a realtime LLM without any changes to the avatar setup.

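from vision_agents.core import Agent
from vision_agents.plugins import gemini
from vision_agents.plugins.anam import AnamAvatarPublisher

agent = Agent(
    llm=gemini.Realtime(),  # realtime speech-to-speech model
    processors=[AnamAvatarPublisher()],  # avatar setup is unchanged
    # ... (other Agent arguments elided)
)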

Next Steps

Build a Voice Agent

Get started with voice

Build a Video Agent

Add video processing

HeyGen Avatars

Alternative avatar provider
