Control every GPU.
Route every model.

One OpenAI-compatible API across cloud models and your own GPUs. Provision, route, train and observe — from a single control plane.

Open dashboard → Read the docs

What it is

One control plane for AI workloads.

Point the OpenAI SDK at ai-ctrl and reach every provider — or your own vLLM on cloud GPUs you provision on demand. Same API, money-safe failure modes, full observability.

Features

Gateway

Multi-provider routing across OpenAI, Anthropic, Gemini, vLLM and more. Explicit fallback policy, response cache, token-bucket rate limits.

GPU & Pools

On-demand GPUs on RunPod & Vast.ai. Declarative pools, autoscaling, per-pool budgets and money-safety reapers.

Workloads

Chat, Batch at −50%, distributed Training/LoRA, and ComfyUI image/video — one job surface.

Observability

Live SSE event stream, a four-field status contract, ~130 machine states, and an append-only audit log.

Multi-tenancy

Per-account API keys with mint / rotate / revoke, usage tracking and billing.

ai-ctrl MCP

Drive the whole platform from Claude Code — provision, route, inspect, all from your editor.

How it works

Call one API

Send an OpenAI-compatible request to ai-ctrl with the model you want.

It routes

ai-ctrl picks the provider — or your own vLLM on a pooled GPU — by priority and cost.

You observe

Every machine, job and swap reports live status, progress and errors.

Quickstart

# Point any OpenAI client at ai-ctrl
curl https://api.ai-ctrl.net/v1/chat/completions \
  -H "Authorization: Bearer $AI_CTRL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o-mini","messages":[{"role":"user","content":"hi"}]}'

# Or with the OpenAI Python SDK
from openai import OpenAI
client = OpenAI(base_url="https://api.ai-ctrl.net/v1", api_key="$AI_CTRL_API_KEY")

Built to be observable

Every resource reports the same four fields.

Automation routes on stable enums; humans read the detail. No guessing.

status

Small stable enum. Automation fans out on this.

state_detail

Human-readable current activity.

state_progress

0–100% through the current step.

error_code

Stable category for retry/alert switching.

Control every GPU.Route every model.