Wizper is a real-time Whisper v3 variant hosted by Fal.ai. It provides accurate speech-to-text (STT) with on-the-fly translation to 99+ languages.
Vision Agents requires a Stream account for real-time transport. Most providers offer free tiers to get started.

Installation

uv add "vision-agents[wizper]"

Quick Start

from vision_agents.core import Agent, User
from vision_agents.plugins import wizper, gemini, elevenlabs, getstream

agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="Assistant", id="agent"),
    instructions="You are a helpful assistant.",
    llm=gemini.LLM("gemini-2.5-flash"),
    stt=wizper.STT(),
    tts=elevenlabs.TTS(),
)
Set FAL_KEY in your environment for Fal.ai authentication.
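
A missing key typically surfaces only when the first audio chunk is sent, so it can help to check for it at startup. The sketch below is one way to do that; `require_fal_key` is a hypothetical helper written for this page, not part of Vision Agents or the Fal.ai client.

```python
import os


def require_fal_key(env=os.environ):
    """Return the Fal.ai key from `env`, or raise if it is missing or blank.

    Hypothetical helper for illustration; not part of vision_agents.
    """
    key = env.get("FAL_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "FAL_KEY is not set; export it before creating wizper.STT()"
        )
    return key
```

Calling `require_fal_key()` before constructing the `Agent` fails fast with a readable message instead of an authentication error mid-call.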

Parameters

Name             Type  Default       Description
task             str   "transcribe"  Task to run: "transcribe" or "translate"
target_language  str   None          ISO-639-1 code for translation (e.g., "es", "fr")
sample_rate      int   48000         Audio sample rate in Hz
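
If these values come from user input or a config file, a small validation step before constructing the STT can catch typos early. The following is a minimal sketch under stated assumptions: `wizper_stt_options` is a hypothetical helper, it relies only on the three parameters listed above, and it checks the shape of the language code (two ASCII letters), not whether Wizper actually supports that language.

```python
VALID_TASKS = {"transcribe", "translate"}


def wizper_stt_options(task="transcribe", target_language=None, sample_rate=48000):
    """Validate and normalize keyword arguments for wizper.STT.

    Hypothetical helper for illustration; not part of vision_agents.
    """
    if task not in VALID_TASKS:
        raise ValueError(f"task must be one of {sorted(VALID_TASKS)}, got {task!r}")

    if target_language is not None:
        code = target_language.strip().lower()
        # ISO-639-1 codes are exactly two ASCII letters, e.g. "es" or "fr".
        if len(code) != 2 or not (code.isascii() and code.isalpha()):
            raise ValueError(f"expected an ISO-639-1 code, got {target_language!r}")
        target_language = code

    if sample_rate <= 0:
        raise ValueError(f"sample_rate must be positive, got {sample_rate}")

    return {
        "task": task,
        "target_language": target_language,
        "sample_rate": sample_rate,
    }


# Intended use (requires the wizper plugin to be installed):
#   stt = wizper.STT(**wizper_stt_options(target_language="ES"))
```

The helper returns a plain dict so it can be unit-tested without network access or a Fal.ai key.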

Translation

Translate speech to any supported language:
# Translate all speech to Spanish
stt = wizper.STT(target_language="es")

Next Steps