Wizper is a real-time Whisper v3 variant hosted by Fal.ai. It provides accurate speech-to-text (STT) with on-the-fly translation to 99+ languages.
Vision Agents requires a Stream account for real-time transport. Most providers offer free tiers to get started.
Installation
uv add "vision-agents[wizper]"
Quick Start
from vision_agents.core import Agent, User
from vision_agents.plugins import wizper, gemini, elevenlabs, getstream
agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="Assistant", id="agent"),
    instructions="You are a helpful assistant.",
    llm=gemini.LLM("gemini-2.5-flash"),
    stt=wizper.STT(),
    tts=elevenlabs.TTS(),
)
Set FAL_KEY in your environment for Fal.ai authentication.
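
A missing key is easier to diagnose at startup than in the middle of a call. The following is a minimal sketch of a pre-flight check. The helper name `require_fal_key` is hypothetical and not part of Vision Agents or the Fal.ai client; it only reads the environment.

```python
import os
from typing import Mapping


def require_fal_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the Fal.ai key from `env`, or raise if it is missing or blank.

    Hypothetical helper for illustration; not part of Vision Agents.
    """
    key = env.get("FAL_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "FAL_KEY is not set; export it before creating wizper.STT()"
        )
    return key
```

Calling `require_fal_key()` before building the agent turns a missing key into an immediate, readable error.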
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| task | str | "transcribe" | Task ("transcribe" or "translate") |
| target_language | str | None | ISO-639-1 code for translation (e.g., "es", "fr") |
| sample_rate | int | 48000 | Audio sample rate in Hz |
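
The constraints in the table can be checked before the plugin is constructed. The sketch below is a hypothetical helper, `build_stt_kwargs`, that is not part of Vision Agents. It assumes the three parameters and defaults listed above, and it checks only that a language code has the two-letter shape of an ISO-639-1 code, not that Wizper actually supports it.

```python
import re
from typing import Any, Dict, Optional

# Values taken from the parameter table above.
ALLOWED_TASKS = {"transcribe", "translate"}
# Shape check only: two lowercase ASCII letters, as ISO-639-1 codes are written.
_ISO_639_1_SHAPE = re.compile(r"[a-z]{2}")


def build_stt_kwargs(
    task: str = "transcribe",
    target_language: Optional[str] = None,
    sample_rate: int = 48000,
) -> Dict[str, Any]:
    """Validate the documented parameters and return them as keyword arguments.

    Hypothetical helper for illustration; not part of Vision Agents.
    """
    if task not in ALLOWED_TASKS:
        raise ValueError(f"task must be one of {sorted(ALLOWED_TASKS)}, got {task!r}")
    if target_language is not None and not _ISO_639_1_SHAPE.fullmatch(target_language):
        raise ValueError(f"target_language must be a two-letter code, got {target_language!r}")
    if sample_rate <= 0:
        raise ValueError(f"sample_rate must be a positive number of Hz, got {sample_rate}")
    return {
        "task": task,
        "target_language": target_language,
        "sample_rate": sample_rate,
    }
```

Assuming the constructor accepts these keyword arguments as the table suggests, the result could be passed along as `wizper.STT(**build_stt_kwargs(target_language="es"))`.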
Translation
Translate speech to any supported language:
# Translate all speech to Spanish
stt = wizper.STT(target_language="es")
Next Steps