v2.0 · one key · every frontier model

One API.Every frontiermodel.

Nimbus routes OpenAI, Anthropic, Google and 12 more frontier providers behind one OpenAI-compatible endpoint. One key, one bill, up to 50% below market.

Get an API key Read the docs

4.9 · loved by 12,400+ developers

SCROLL — six reasons we win

// WORKS WITH — 24 MODELS · 8 PROVIDERS · ONE ENDPOINT

OpenAIAnthropicGoogle GeminiDeepSeekQwenZ-AIMoonshotVS CodeClaude CodeOpenCodeClineCursorAI AgentsOpenAIAnthropicGoogle GeminiDeepSeekQwenZ-AIMoonshotVS CodeClaude CodeOpenCodeClineCursorAI AgentsOpenAIAnthropicGoogle GeminiDeepSeekQwenZ-AIMoonshotVS CodeClaude CodeOpenCodeClineCursorAI Agents

// 02 — BUILT FOR SHIPPERS

Six small superpowers that compound.

One key. One bill. Zero retention. Ship the AI features today, not next quarter.

SDK compatible

Drop-in replacement for OpenAI and Anthropic SDKs. Change the base URL, keep your code.

Learn more

Prepaid balance

Top up any amount. No subscriptions, no monthly fees. Unused balance never expires.

Learn more

Cost analytics

Per-model, per-key dashboards. Export CSV. Set hard spend caps before scaling.

Learn more

No hard limits

Burst as hard as you need. Capacity scales with your balance, not a plan tier.

Learn more

Zero retention

Prompts stream through and vanish. Only token counts persist — for your invoice.

Learn more

All providers

OpenAI, Anthropic, Google, DeepSeek, Qwen, Z-AI, Moonshot. One key, one bill, one endpoint.

Learn more

// 03 — PRICING

Frontier models. One bill. Up to 50% cheaper.

anthropic1M-50%

Claude Opus 4.8

anthropic/claude-opus-4.8

Input

$2.5$5.00per 1M tokens

Output

$12.5$25.00per 1M tokens

anthropic1M-50%

Claude Opus 4.7

anthropic/claude-opus-4.7

Top-tier reasoning. Best for complex agents and long-context work.

Input

$2.5$5.00per 1M tokens

Output

$12.5$25.00per 1M tokens

anthropic1M-50%

Claude Opus 4.6

anthropic/claude-opus-4.6

Previous-gen Opus. Proven reasoning at the same price.

Input

$2.5$5.00per 1M tokens

Output

$12.5$25.00per 1M tokens

anthropic1M-50%

Claude Sonnet 4.6

anthropic/claude-sonnet-4.6

Balanced speed and intelligence. Great default for chat and tools.

Input

$1.5$3.00per 1M tokens

Output

$7.5$15.00per 1M tokens

anthropic200K-50%

Claude Haiku 4.5

anthropic/claude-haiku-4.5

Lowest latency. Built for high-throughput, real-time tasks.

Input

$0.5$1.00per 1M tokens

Output

$2.5$5.00per 1M tokens

openai1M-50%

GPT-5.5

openai/gpt-5.5

Most-used flagship. Strong general intelligence with 1M context.

Input

$2.5$5.00per 1M tokens

Output

$15.00$30.00per 1M tokens

openai1M-50%

GPT-5.4

openai/gpt-5.4

Best price-to-quality. Workhorse for high-volume pipelines.

Input

$1.25$2.5per 1M tokens

Output

$7.5$15.00per 1M tokens

openai400K-50%

GPT-5.4 mini

openai/gpt-5.4-mini

Compact GPT-5.4. Cheaper variant with the same skills, lower latency.

Input

$0.375$0.75per 1M tokens

Output

$2.25$4.5per 1M tokens

openai400K-50%

GPT-5.3 Codex

openai/gpt-5.3-codex

Code-tuned GPT-5.3. Best for agents, refactors and toolcalls.

Input

$0.875$1.75per 1M tokens

Output

$7.00$14.00per 1M tokens

openai400K-50%

GPT-5.1 Codex Max

openai/gpt-5.1-codex-max

Long-context code model. Built for repo-scale refactors.

Input

$0.625$1.25per 1M tokens

Output

$5.00$10.00per 1M tokens

openai400K-50%

GPT-5.1 Codex mini

openai/gpt-5.1-codex-mini

Tiny code model. Cheap autocomplete and quick agent loops.

Input

$0.125$0.25per 1M tokens

Output

$1.00$2.00per 1M tokens

google1M-50%

Gemini 3.1 Pro Preview

google/gemini-3.1-pro-preview

Input

$1.00$2.00per 1M tokens

Output

$6.00$12.00per 1M tokens

google1M-50%

Gemini Flash 3.5

google/gemini-3.5-flash

Fast, cheap Gemini. Great default for high-volume Google workloads.

Input

$0.75$1.5per 1M tokens

Output

$4.5$9.00per 1M tokens

google1M-50%

Gemini 3 Flash Preview

google/gemini-3-flash-preview

Input

$0.25$0.5per 1M tokens

Output

$1.5$3.00per 1M tokens

china1M-50%

DeepSeek V4 Pro

deepseek/deepseek-v4-pro

Flagship DeepSeek. 1M context, strong reasoning at a fraction of the price.

Input

$0.215$0.43per 1M tokens

Output

$0.435$0.87per 1M tokens

china1M-50%

Qwen3 Coder

qwen/qwen3-coder

Code-specialized Qwen. 1M context, built for agentic coding workflows.

Input

$0.11$0.22per 1M tokens

Output

$0.9$1.8per 1M tokens

china1M-50%

Qwen3.7 Max

qwen/qwen3.7-max

Input

$0.625$1.25per 1M tokens

Output

$1.875$3.75per 1M tokens

china200K-50%

GLM-5

z-ai/glm-5

Latest Z.AI flagship. Strong reasoning and tool use for agents.

Input

$0.3$0.6per 1M tokens

Output

$0.96$1.92per 1M tokens

china200K-50%

GLM 5.1

z-ai/glm-5.1

Input

$0.49$0.98per 1M tokens

Output

$1.54$3.08per 1M tokens

china1M-50%

GLM 5.2

z-ai/glm-5.2

Input

$0.7$1.4per 1M tokens

Output

$2.2$4.4per 1M tokens

china256K-50%

Kimi K2.6

moonshotai/kimi-k2.6

Moonshot agentic model with vision. Long context, great for tools.

Input

$0.34$0.68per 1M tokens

Output

$1.705$3.41per 1M tokens

china256K-50%

Kimi K2.7 Code

moonshotai/kimi-k2.7-code

Input

$0.375$0.75per 1M tokens

Output

$1.75$3.5per 1M tokens

image—