Pricing

28 models across 5 providers. Every rate is exactly 50% off the official published rate from Anthropic, OpenAI, Google, DeepSeek, Qwen, Z.AI, and Moonshot.

Nimbus is prepaid. You add funds to your account balance and every request debits at the rate below. There is no seat license, no monthly commit, no throttle. Prices are USD per 1,000,000 tokens. The crossed number is the vendor's public list rate — the black number is what Nimbus charges. If you want a quick primer, jump to Top up or Models.

Text & chat models

Model ID	Provider	Context	Input / 1M	Output / 1M	Retail
anthropic/claude-opus-4.8	Anthropic	1M	$2.50	$12.50	$5.00 / $25.00
anthropic/claude-opus-4.7	Anthropic	1M	$2.50	$12.50	$5.00 / $25.00
anthropic/claude-opus-4.6	Anthropic	1M	$2.50	$12.50	$5.00 / $25.00
anthropic/claude-sonnet-4.6	Anthropic	1M	$1.50	$7.50	$3.00 / $15.00
anthropic/claude-haiku-4.5	Anthropic	200K	$0.5	$2.50	$1.00 / $5.00
openai/gpt-5.5	OpenAI	1M	$2.50	$15.00	$5.00 / $30.00
openai/gpt-5.4	OpenAI	1M	$1.25	$7.50	$2.50 / $15.00
openai/gpt-5.4-mini	OpenAI	400K	$0.375	$2.25	$0.75 / $4.50
openai/gpt-5.3-codex	OpenAI	400K	$0.875	$7.00	$1.75 / $14.00
openai/gpt-5.1-codex-max	OpenAI	400K	$0.625	$5.00	$1.25 / $10.00
openai/gpt-5.1-codex-mini	OpenAI	400K	$0.125	$1.00	$0.25 / $2.00
google/gemini-3.1-pro-preview	Google	1M	$1.00	$6.00	$2.00 / $12.00
google/gemini-3.5-flash	Google	1M	$0.75	$4.50	$1.50 / $9.00
google/gemini-3-flash-preview	Google	1M	$0.25	$1.50	$0.5 / $3.00
deepseek/deepseek-v4-pro	China	1M	$0.215	$0.435	$0.43 / $0.87
qwen/qwen3-coder	China	1M	$0.11	$0.9	$0.22 / $1.80
qwen/qwen3.7-max	China	1M	$0.625	$1.88	$1.25 / $3.75
z-ai/glm-5	China	200K	$0.3	$0.96	$0.6 / $1.92
z-ai/glm-5.1	China	200K	$0.49	$1.54	$0.98 / $3.08
z-ai/glm-5.2	China	1M	$0.7	$2.20	$1.40 / $4.40
moonshotai/kimi-k2.6	China	256K	$0.34	$1.71	$0.68 / $3.41
moonshotai/kimi-k2.7-code	China	256K	$0.375	$1.75	$0.75 / $3.50

Image models

Image generation is billed per image based on the underlying vendor rate. All image models are priced at 50% off official rates. Contact billing@nimbusapi.net for the per-image cost sheet.

Model ID	Provider	Type
google/gemini-2.5-flash-image	Image	Image generation
google/gemini-3-pro-image	Image	Image generation
google/gemini-3.1-flash-image	Image	Image generation
openai/gpt-5-image	Image	Image generation
openai/gpt-5-image-mini	Image	Image generation
openai/gpt-5.4-image-2	Image	Image generation

How the 50% discount works

Nimbus buys tokens at wholesale under enterprise contracts with each provider (Anthropic direct, OpenAI direct, Google AI Studio + Vertex, and reseller arrangements for the China labs). Wholesale rates are typically 40–60% below published list, and we pass the entire spread to customers by pricing every model at exactly retail ÷ 2. When a vendor lowers the public rate, we lower ours the same day. When a vendor raises the public rate, we keep the -50% ratio.

Volume discount rules

Under $5,000/month: published rate above. No extra discount.
$5,000 – $25,000/month: an extra 10% off any single model on request. Email sales@nimbusapi.net with your model mix and expected volume.
$25,000 – $100,000/month: 15% extra, monthly invoice with net-15 terms available in place of prepaid.
Over $100,000/month: up to 25% extra plus a dedicated capacity guarantee at Anthropic / OpenAI / Google, private Slack channel, and named account engineer.

Billing granularity

Every request debits the account balance immediately after the response completes. Input and output tokens are metered separately using the exact token counts returned by the upstream vendor. Cached input (Anthropic / OpenAI cache-control blocks) is billed at the provider's cached rate divided by two — the discount stacks on the vendor cache discount. Streamed responses charge only for tokens actually delivered. Failed responses (5xx from the upstream) never bill.