Turn Detection identifies when a speaker has finished their conversational turn and it’s appropriate for an AI to respond. It solves a critical problem in voice AI: respond too early and you interrupt the speaker; wait too long and the conversation feels awkward.

How It Works

Turn detection analyzes audio through a multi-stage pipeline:
  1. Voice Activity Detection (VAD): Detects when someone is speaking
  2. Audio Buffering: Collects speech segments for analysis
  3. AI Analysis: Examines speech patterns, content, and context to predict turn completion
  4. Turn Signals: Sends TurnStarted and TurnEnded signals into the pipeline so the agent knows when to stop its own output (the user has started speaking) and when to respond (the user has finished)
The key insight is distinguishing between “I’m pausing to think” and “I’m done talking”—something simple silence detection can’t do.
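
The four stages above can be sketched as a toy loop. This is an illustrative sketch only, not the Vision Agents implementation: `Frame`, `is_speech`, `looks_complete`, and `detect_turns` are hypothetical names, the energy threshold stands in for a real VAD, and the trailing-word check stands in for a neural turn completion model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    """One chunk of audio, reduced to what this toy example needs."""
    energy: float   # stand-in for a VAD speech score in [0, 1]
    text: str = ""  # stand-in for the content recovered from the speech


def is_speech(frame: Frame, threshold: float = 0.5) -> bool:
    """Stage 1 (VAD): decide whether the frame contains speech."""
    return frame.energy >= threshold


def looks_complete(words: List[str]) -> bool:
    """Stage 3 (analysis): toy replacement for a neural model.

    An utterance that trails off on a filler or conjunction is treated
    as "pausing to think" rather than "done talking".
    """
    if not words:
        return False
    unfinished = {"and", "but", "so", "because", "um", "uh"}
    return words[-1].lower().strip(".,?!") not in unfinished


def detect_turns(frames: List[Frame], max_silence: int = 3) -> List[str]:
    """Stage 4 (signals): return the turn signals for a frame sequence."""
    signals: List[str] = []
    buffer: List[str] = []  # Stage 2 (buffering): words heard this turn
    in_turn = False
    silence = 0
    for frame in frames:
        if is_speech(frame):
            silence = 0
            if not in_turn:
                in_turn = True
                signals.append("TurnStarted")
            buffer.extend(frame.text.split())
        elif in_turn:
            silence += 1
            # End the turn on a pause only if the utterance looks finished,
            # or if the silence has lasted long enough to give up waiting.
            if looks_complete(buffer) or silence >= max_silence:
                signals.append("TurnEnded")
                in_turn = False
                buffer = []
                silence = 0
    return signals


# A mid-sentence pause after "and" does not end the turn.
frames = [
    Frame(0.9, "I want a pizza and"),
    Frame(0.1),
    Frame(0.9, "a soda"),
    Frame(0.1),
]
signals = detect_turns(frames)
```

In this sketch the pause after "and" is ignored because the buffered words look unfinished, so the whole utterance produces a single TurnStarted and TurnEnded pair. A pure silence detector would have ended the turn at the first quiet frame.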

Turn Detection vs VAD

| | VAD | Turn Detection |
| --- | --- | --- |
| Question | “Is someone speaking?” | “Has the speaker finished?” |
| Output | Speech start/end timestamps | TurnStarted / TurnEnded turn signals |
| Intelligence | Simple audio analysis | Conversational context |
| Best for | Detecting presence | Knowing when to respond |
Vision Agents’ turn detection uses VAD under the hood, then applies neural models to determine turn completion.
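
To make the difference in output concrete, here is a minimal sketch of what a VAD alone produces: speech segments with start and end timestamps. The function `vad_segments` and the 20 ms frame size are assumptions for illustration and are not part of any Vision Agents or Silero API.

```python
from typing import List, Optional, Tuple


def vad_segments(speech_flags: List[bool], frame_ms: int = 20) -> List[Tuple[int, int]]:
    """Convert per-frame speech flags into (start_ms, end_ms) segments."""
    segments: List[Tuple[int, int]] = []
    start: Optional[int] = None
    for i, flag in enumerate(speech_flags):
        if flag and start is None:
            start = i  # speech begins
        elif not flag and start is not None:
            segments.append((start * frame_ms, i * frame_ms))  # speech ends
            start = None
    if start is not None:
        # Audio ended while the speaker was still talking.
        segments.append((start * frame_ms, len(speech_flags) * frame_ms))
    return segments


# One sentence with a short thinking pause in the middle.
flags = [True, True, False, False, True, True, True, False]
segments = vad_segments(flags)
```

The VAD reports two separate segments here because it only answers "Is someone speaking?". Deciding that both segments belong to one unfinished turn, and emitting a single TurnStarted / TurnEnded pair, is the job of the turn detection layer on top.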

Available Plugins

| Plugin | Description |
| --- | --- |
| Smart Turn | Combines Silero VAD, Whisper features, and neural turn completion models |
| Vogent | Neural turn detection with high accuracy prediction |
Some STT plugins also include built-in turn detection via VAD, which means no separate plugin is needed:
| STT Plugin | Turn Detection |
| --- | --- |
| Deepgram | Built-in with eager_turn_detection option |
| ElevenLabs | Built-in via VAD commit strategy |
For Realtime APIs (OpenAI, Gemini, AWS Bedrock, Qwen), turn detection is built-in at the model level—no separate plugin needed.
When an STT plugin provides built-in turn detection, the Agent automatically ignores any external TurnDetector plugin to prevent conflicts.
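
The precedence rule for STT plugins can be pictured with a small sketch. Everything here is hypothetical: `SttPlugin`, `TurnDetectorPlugin`, the `has_builtin_turn_detection` attribute, and `effective_turn_source` are illustrative names, not the Agent's actual internals. Realtime APIs are left out because they handle turn detection inside the model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SttPlugin:
    """Hypothetical stand-in for an STT plugin."""
    name: str
    has_builtin_turn_detection: bool = False  # assumed attribute, for illustration


@dataclass
class TurnDetectorPlugin:
    """Hypothetical stand-in for an external turn detection plugin."""
    name: str


def effective_turn_source(stt: SttPlugin, external: Optional[TurnDetectorPlugin]) -> str:
    """Return which component would drive turn signals under the stated rule."""
    if stt.has_builtin_turn_detection:
        # Built-in detection wins; any external detector is ignored.
        return f"stt:{stt.name}"
    if external is not None:
        return f"plugin:{external.name}"
    return "none"


with_builtin = effective_turn_source(
    SttPlugin("deepgram", has_builtin_turn_detection=True),
    TurnDetectorPlugin("smart_turn"),
)
without_builtin = effective_turn_source(
    SttPlugin("other_stt"),
    TurnDetectorPlugin("smart_turn"),
)
```

Under this rule, passing an external detector alongside an STT plugin with built-in detection is harmless but has no effect, so only one source ever emits turn signals.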

Use Cases

  • Voice Assistants: Respond at the right moment without interrupting
  • Customer Service Bots: Natural conversation flow with customers
  • Real-time Translation: Capture complete thoughts before translating
  • Meeting Intelligence: Identify natural break points for summarization
  • Interview Tools: AI interviewers that don’t interrupt

Next Steps

  • Interruption Handling — How to use turn detection in your agent (setup, tuning, troubleshooting)
  • Smart Turn — Configure the Smart Turn plugin
  • Vogent — Alternative turn detection option