Deploy Vision Agents to production using Docker. For a complete Kubernetes setup with Helm charts, monitoring, and Grafana dashboards, see the Kubernetes Deployment guide.
Key Considerations
| Factor | Recommendation |
|---|---|
| Region | US East for lowest latency (most AI providers default here) |
| CPU vs GPU | CPU for most voice agents; GPU only if running local models |
| Scaling | Use the HTTP server for multi-session deployments |
Docker
Two Dockerfiles are provided:
- CPU (Dockerfile): small and fast to build (~150MB)
- GPU (Dockerfile.gpu): for local model inference (~8GB)
Only use the GPU Dockerfile if running local models (Roboflow, local VLMs). Most voice agents use cloud APIs and don’t need GPUs. Make sure CUDA drivers are installed and the base image matches your CUDA version.
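The choice between the two images can be expressed as a small helper. This is a minimal sketch, not part of Vision Agents: the function name `docker_build_command` and the image tag `vision-agent:latest` are hypothetical, and it assumes both Dockerfiles sit in the current directory. Only the standard `docker build` flags `-f` and `-t` are used.

```python
def docker_build_command(uses_local_models: bool,
                         tag: str = "vision-agent:latest") -> list[str]:
    """Return the argv for `docker build` using the CPU or GPU Dockerfile.

    `tag` is a hypothetical image name; replace it with your own.
    """
    # Pick the GPU image only when the agent runs local models;
    # agents that call cloud APIs use the smaller CPU image.
    dockerfile = "Dockerfile.gpu" if uses_local_models else "Dockerfile"
    # -f selects the Dockerfile, -t names the image, "." is the build context.
    return ["docker", "build", "-f", dockerfile, "-t", tag, "."]
```

The returned list can be passed to `subprocess.run(..., check=True)` from a build script, which avoids shell quoting issues.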
Environment Variables
Create a .env file with your API keys.
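Which variables are required depends on the providers your agent uses, so the names below are placeholders rather than keys the framework defines. As a minimal sketch, assuming a plain `KEY=VALUE` file format, a startup check like this can catch a missing key before the container begins accepting sessions. `load_env_file` and `missing_keys` are hypothetical helpers written for this example.

```python
from pathlib import Path


def load_env_file(path: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict[str, str], required: list[str]) -> list[str]:
    """Return the required names that are absent or set to an empty value."""
    return [name for name in required if not values.get(name)]
```

For example, `missing_keys(load_env_file(".env"), ["EXAMPLE_LLM_API_KEY"])` returns an empty list when the placeholder key is present and non-empty. When running the container, Docker's `--env-file` option can pass the same file to the process.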
Next Steps
- Built-in HTTP Server: API endpoints, session limits, and authentication
- Horizontal Scaling: Scale across multiple servers with Redis
- Kubernetes Deployment: Helm chart, Prometheus, and Grafana
- Telemetry & Metrics: OpenTelemetry, Prometheus, and Jaeger setup

