Gateway
Multi-provider routing across OpenAI, Anthropic, Gemini, vLLM and more. Explicit fallback policy, response cache, token-bucket rate limits.
One OpenAI-compatible API across cloud models and your own GPUs. Provision, route, train and observe — from a single control plane.
Point the OpenAI SDK at ai-ctrl and reach every provider — or your own vLLM on cloud GPUs you provision on demand. Same API, money-safe failure modes, full observability.
Multi-provider routing across OpenAI, Anthropic, Gemini, vLLM and more. Explicit fallback policy, response cache, token-bucket rate limits.
On-demand GPUs on RunPod & Vast.ai. Declarative pools, autoscaling, per-pool budgets and money-safety reapers.
Chat, Batch at −50%, distributed Training/LoRA, and ComfyUI image/video — one job surface.
Live SSE event stream, a four-field status contract, ~130 machine states, and an append-only audit log.
Per-account API keys with mint / rotate / revoke, usage tracking and billing.
Drive the whole platform from Claude Code — provision, route, inspect, all from your editor.
Send an OpenAI-compatible request to ai-ctrl with the model you want.
ai-ctrl picks the provider — or your own vLLM on a pooled GPU — by priority and cost.
Every machine, job and swap reports live status, progress and errors.
# Point any OpenAI client at ai-ctrl
curl https://api.ai-ctrl.net/v1/chat/completions \
-H "Authorization: Bearer $AI_CTRL_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"hi"}]}'
# Or with the OpenAI Python SDK
from openai import OpenAI
client = OpenAI(base_url="https://api.ai-ctrl.net/v1", api_key="$AI_CTRL_API_KEY")
Automation routes on stable enums; humans read the detail. No guessing.
statusSmall stable enum. Automation fans out on this.
state_detailHuman-readable current activity.
state_progress0–100% through the current step.
error_codeStable category for retry/alert switching.