Kokoro is a local TTS engine that runs entirely on your machine. No API key or internet connection required. Ideal for offline voice synthesis, privacy-sensitive applications, or prototyping.
Vision Agents requires a Stream account for real-time transport. Most providers offer free tiers to get started.

Installation

uv add "vision-agents[kokoro]"

Quick Start

from vision_agents.core import Agent, User
from vision_agents.plugins import kokoro, gemini, deepgram, getstream

agent = Agent(
    edge=getstream.Edge(),  # real-time transport via Stream
    agent_user=User(name="Assistant", id="agent"),
    instructions="You are a helpful assistant.",
    llm=gemini.LLM("gemini-2.5-flash"),
    stt=deepgram.STT(),  # speech-to-text
    tts=kokoro.TTS(),  # local text-to-speech with default settings
)

Parameters

Name       Type   Default     Description
voice      str    "af_heart"  Voice preset
lang_code  str    "a"         Language code ("a" = American English)
speed      float  1.0         Playback speed (e.g., 0.9 slower, 1.2 faster)
device     str    None        Device ("cuda", "cpu", or auto-detect)
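
The parameters above can be overridden when constructing the TTS. The following is a minimal sketch rather than part of the plugin: kokoro_tts_kwargs and make_tts are hypothetical helper names, and the positive-speed check is a sanity check added for illustration, not something the library is known to enforce. Only the four keyword names and their defaults come from the table.

```python
from typing import Any, Dict, Optional


def kokoro_tts_kwargs(
    voice: str = "af_heart",
    lang_code: str = "a",
    speed: float = 1.0,
    device: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: gather the documented kokoro.TTS parameters."""
    # Assumption of this sketch: a non-positive speed is never meaningful.
    if speed <= 0:
        raise ValueError("speed must be positive")
    return {
        "voice": voice,
        "lang_code": lang_code,
        "speed": speed,
        "device": device,  # None leaves device selection to auto-detect
    }


def make_tts(**overrides: Any):
    """Hypothetical helper: build a Kokoro TTS with the given overrides."""
    # Imported lazily so the helper above works without the plugin installed.
    from vision_agents.plugins import kokoro

    return kokoro.TTS(**kokoro_tts_kwargs(**overrides))
```

For example, make_tts(speed=0.9, device="cpu") would request slightly slower speech on the CPU, and the result can be passed as tts= in the Quick Start agent.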

Next Steps