Docker Deployment

Deploy Vision Agents to production using Docker. For a complete Kubernetes setup with Helm charts, monitoring, and Grafana dashboards, see the Kubernetes Deployment guide.

Prefer managed hosting? Stream Voice AI runs production voice agents on Stream’s global edge, with phone numbers, web and mobile clients, co-located STT/LLM/TTS, and built-in observability. Join the waitlist for early access.

Key Considerations

Factor	Recommendation
Region	US East for lowest latency (most AI providers default here)
CPU vs GPU	CPU for most voice agents; GPU only if running local models
Scaling	Use the HTTP server for multi-session deployments

Docker

Two Dockerfiles are provided:

CPU (Dockerfile) - Small, fast to build (~150MB)

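FROM python:3.13-slim
WORKDIR /app
# uv manages dependencies from pyproject.toml and uv.lock
RUN pip install uv
COPY pyproject.toml uv.lock agent.py ./
EXPOSE 8080
ENV UV_LINK_MODE=copy
# Dependencies are installed from the lockfile at container start, then the HTTP server launches
CMD ["sh", "-c", "uv sync --frozen && uv run agent.py serve --host 0.0.0.0 --port 8080"]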

GPU (Dockerfile.gpu) - For local model inference (~8GB)

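FROM pytorch/pytorch:2.9.1-cuda12.8-cudnn9-runtime
WORKDIR /app
# uv manages dependencies from pyproject.toml and uv.lock
RUN pip install uv
COPY pyproject.toml uv.lock agent.py ./
EXPOSE 8080
ENV UV_LINK_MODE=copy
# Dependencies are installed from the lockfile at container start, then the HTTP server launches
CMD ["sh", "-c", "uv sync --frozen && uv run agent.py serve --host 0.0.0.0 --port 8080"]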

Build for Linux (required for cloud deployment):

docker buildx build --platform linux/amd64 -t vision-agent .

Only use the GPU Dockerfile if you run local models (Roboflow, local VLMs). Most voice agents use cloud APIs and don’t need GPUs. If you do use it, make sure NVIDIA drivers are installed on the host and that the base image’s CUDA version is compatible with them.
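To make the build and run steps easier to script, here is a minimal Python sketch that assembles the Docker command lines. It only builds argument lists and does not call Docker. The build arguments mirror the command above; the `docker run` flags (`--rm`, `-p`, `--env-file`, `--gpus all`) are standard Docker CLI options, but pairing them with this image is an assumption and not something this guide prescribes. The helper names `build_command` and `run_command` are hypothetical.

```python
import shlex


def build_command(tag: str = "vision-agent", gpu: bool = False) -> list[str]:
    """Assemble a `docker buildx build` command targeting linux/amd64."""
    cmd = ["docker", "buildx", "build", "--platform", "linux/amd64", "-t", tag]
    if gpu:
        # Select the GPU Dockerfile instead of the default Dockerfile.
        cmd += ["-f", "Dockerfile.gpu"]
    return cmd + ["."]


def run_command(
    tag: str = "vision-agent",
    host_port: int = 8080,
    env_file: str = ".env",
    gpu: bool = False,
) -> list[str]:
    """Assemble a `docker run` command (assumed flags, see the note above)."""
    # The container listens on 8080, matching EXPOSE and --port in the Dockerfiles.
    cmd = ["docker", "run", "--rm", "-p", f"{host_port}:8080", "--env-file", env_file]
    if gpu:
        # Exposing GPUs to the container needs the NVIDIA Container Toolkit on the host.
        cmd += ["--gpus", "all"]
    return cmd + [tag]


# Print the commands so they can be reviewed or pasted into a shell.
print(shlex.join(build_command()))
print(shlex.join(run_command()))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute it. Keeping the commands as lists avoids shell quoting problems when tags or paths contain spaces.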

Environment Variables

Create a .env file with your API keys:

STREAM_API_KEY=your_key
STREAM_API_SECRET=your_secret
DEEPGRAM_API_KEY=your_key
ELEVENLABS_API_KEY=your_key
GOOGLE_API_KEY=your_key
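A missing key usually surfaces only once the agent tries to reach a provider, so it can help to check the file before deploying. The sketch below is one way to do that, assuming a plain `KEY=value` file like the one above. The function names are hypothetical, and the required-key list is copied from this example, so adjust it to the providers your agent actually uses.

```python
# Keys taken from the example .env above (an assumption about your setup).
REQUIRED_KEYS = [
    "STREAM_API_KEY",
    "STREAM_API_SECRET",
    "DEEPGRAM_API_KEY",
    "ELEVENLABS_API_KEY",
    "GOOGLE_API_KEY",
]


def parse_env_file(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(text: str, required: list[str] = REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent or have an empty value."""
    values = parse_env_file(text)
    return [key for key in required if not values.get(key)]
```

For example, calling `missing_keys(open(".env").read())` in a deploy script and aborting when the result is non-empty catches an incomplete file before the container starts. This parser deliberately ignores quoting and `export` prefixes; a full dotenv parser would handle those.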

For Kubernetes, create a secret:

kubectl create secret generic vision-agent-env --from-env-file=.env

Next Steps

Built-in HTTP Server - API endpoints, session limits, and authentication

Horizontal Scaling - Scale across multiple servers with Redis

Kubernetes Deployment - Helm chart, Prometheus, and Grafana

Telemetry & Metrics - OpenTelemetry, Prometheus, and Jaeger setup
