Models
28 models on Nimbus, grouped by provider. Every model streams, most call tools, most see images. Every price is exactly 50% off the vendor's list rate.
Full price table is on the Pricing page. This page is for choosing a model — what each one is good at, and what capabilities it supports. All models share the same OpenAI-compatible endpoint at https://llm.nimbusapi.net/v1. Anthropic-format is also available at /anthropic/v1 for Claude models.
Anthropic
Claude family. Strongest at long-context reasoning, agent loops, and tool use. Best default for coding assistants and multi-turn agents.
anthropic/claude-opus-4.8
Claude Opus 4.8
Frontier model in the Nimbus catalog. See the pricing page for full rate details.
Context
1M
Best for
Frontier reasoning, autonomous agents, long-horizon planning
Input / 1M
$2.50
Output / 1M
$12.50
anthropic/claude-opus-4.7
Claude Opus 4.7
Top-tier reasoning. Best for complex agents and long-context work.
Context
1M
Best for
Complex agents, deep research, long-context code review
Input / 1M
$2.50
Output / 1M
$12.50
anthropic/claude-opus-4.6
Claude Opus 4.6
Previous-gen Opus. Proven reasoning at the same price.
Context
1M
Best for
Same reasoning at proven stability. Good for pinned production
Input / 1M
$2.50
Output / 1M
$12.50
anthropic/claude-sonnet-4.6
Claude Sonnet 4.6
Balanced speed and intelligence. Great default for chat and tools.
Context
1M
Best for
Balanced default: chat, tool use, moderate reasoning
Input / 1M
$1.50
Output / 1M
$7.50
anthropic/claude-haiku-4.5
Claude Haiku 4.5
Lowest latency. Built for high-throughput, real-time tasks.
Context
200K
Best for
High-throughput classification, real-time chat, cheap fallback
Input / 1M
$0.5
Output / 1M
$2.50
OpenAI
GPT-5 family. Broadest general skills, best-in-class function calling, and the Codex line for code-specific workloads.
openai/gpt-5.5
GPT-5.5
Most-used flagship. Strong general intelligence with 1M context.
Context
1M
Best for
General flagship: assistants, agents, structured output
Input / 1M
$2.50
Output / 1M
$15.00
openai/gpt-5.4
GPT-5.4
Best price-to-quality. Workhorse for high-volume pipelines.
Context
1M
Best for
Best $/quality workhorse. Pipelines and tool-calling backends
Input / 1M
$1.25
Output / 1M
$7.50
openai/gpt-5.4-mini
GPT-5.4 mini
Compact GPT-5.4. Cheaper variant with the same skills, lower latency.
Context
400K
Best for
Cheap GPT-5.4 for high-volume tasks with the same skills
Input / 1M
$0.375
Output / 1M
$2.25
openai/gpt-5.3-codex
GPT-5.3 Codex
Code-tuned GPT-5.3. Best for agents, refactors and toolcalls.
Context
400K
Best for
Coding agents, refactors, function calls, IDE integrations
Input / 1M
$0.875
Output / 1M
$7.00
openai/gpt-5.1-codex-max
GPT-5.1 Codex Max
Long-context code model. Built for repo-scale refactors.
Context
400K
Best for
Long-context code work — whole-repo refactors, PR reviews
Input / 1M
$0.625
Output / 1M
$5.00
openai/gpt-5.1-codex-mini
GPT-5.1 Codex mini
Tiny code model. Cheap autocomplete and quick agent loops.
Context
400K
Best for
Inline autocomplete, cheap agent loops, quick refactors
Input / 1M
$0.125
Output / 1M
$1.00
Gemini family. Cheapest 1M-context model on the market (Flash 3 Preview) and strong multimodal grounding on Pro.
google/gemini-3.1-pro-preview
Gemini 3.1 Pro Preview
Frontier model in the Nimbus catalog. See the pricing page for full rate details.
Context
1M
Best for
1M-context multimodal reasoning, PDF + image + video input
Input / 1M
$1.00
Output / 1M
$6.00
google/gemini-3.5-flash
Gemini Flash 3.5
Fast, cheap Gemini. Great default for high-volume Google workloads.
Context
1M
Best for
Cheap fast Gemini, high-volume Google Workspace integrations
Input / 1M
$0.75
Output / 1M
$4.50
google/gemini-3-flash-preview
Gemini 3 Flash Preview
Frontier model in the Nimbus catalog. See the pricing page for full rate details.
Context
1M
Best for
Cheapest 1M-context model on Nimbus. Bulk classification
Input / 1M
$0.25
Output / 1M
$1.50
China labs
DeepSeek, Qwen, Z.AI, Moonshot. Frontier quality at 5–15% of Western prices. Best value for high-volume pipelines that don't need Anthropic-tier reasoning.
deepseek/deepseek-v4-pro
DeepSeek V4 Pro
Flagship DeepSeek. 1M context, strong reasoning at a fraction of the price.
Context
1M
Best for
Frontier reasoning at $0.22 / $0.44 per 1M. Bulk pipelines
Input / 1M
$0.215
Output / 1M
$0.435
qwen/qwen3-coder
Qwen3 Coder
Code-specialized Qwen. 1M context, built for agentic coding workflows.
Context
1M
Best for
Agentic coding at 1M context. Best cheap coding model
Input / 1M
$0.11
Output / 1M
$0.9
qwen/qwen3.7-max
Qwen3.7 Max
Frontier model in the Nimbus catalog. See the pricing page for full rate details.
Context
1M
Best for
Top-tier Qwen flagship. Long-horizon reasoning tasks
Input / 1M
$0.625
Output / 1M
$1.88
z-ai/glm-5
GLM-5
Latest Z.AI flagship. Strong reasoning and tool use for agents.
Context
200K
Best for
Z.AI flagship. Strong tool use for agentic workloads
Input / 1M
$0.3
Output / 1M
$0.96
z-ai/glm-5.1
GLM 5.1
Frontier model in the Nimbus catalog. See the pricing page for full rate details.
Context
200K
Best for
Incremental Z.AI update. Better tool routing than GLM-5
Input / 1M
$0.49
Output / 1M
$1.54
z-ai/glm-5.2
GLM 5.2
Frontier model in the Nimbus catalog. See the pricing page for full rate details.
Context
1M
Best for
Latest GLM with 1M context. Long-doc summarization
Input / 1M
$0.7
Output / 1M
$2.20
moonshotai/kimi-k2.6
Kimi K2.6
Moonshot agentic model with vision. Long context, great for tools.
Context
256K
Best for
Vision + agentic tools, 256K context. Great for OCR + workflow
Input / 1M
$0.34
Output / 1M
$1.71
moonshotai/kimi-k2.7-code
Kimi K2.7 Code
Frontier model in the Nimbus catalog. See the pricing page for full rate details.
Context
256K
Best for
Latest Kimi coding variant. Strong at Python + JS agents
Input / 1M
$0.375
Output / 1M
$1.75
Image generation
Gemini Flash Image, Gemini 3 Pro Image, GPT-5 Image variants. Billed per image at the underlying vendor rate ÷ 2.
google/gemini-2.5-flash-image
Gemini 2.5 Flash Image
Frontier model in the Nimbus catalog. See the pricing page for full rate details.
Context
—
Best for
Fast image edits, product mockups, iteration
google/gemini-3-pro-image
Gemini 3 Pro Image
Frontier model in the Nimbus catalog. See the pricing page for full rate details.
Context
—
Best for
Highest-fidelity Gemini image gen. Marketing hero shots
google/gemini-3.1-flash-image
Gemini 3.1 Flash Image
Frontier model in the Nimbus catalog. See the pricing page for full rate details.
Context
—
Best for
Cheap fast image gen. Bulk asset production
openai/gpt-5-image
GPT-5 Image
Frontier model in the Nimbus catalog. See the pricing page for full rate details.
Context
—
Best for
OpenAI photorealism. Best for advertising and product renders
openai/gpt-5-image-mini
GPT-5 Image Mini
Frontier model in the Nimbus catalog. See the pricing page for full rate details.
Context
—
Best for
Cheaper GPT-5 image tier. High-volume creative iteration
openai/gpt-5.4-image-2
GPT-5.4 Image 2
Frontier model in the Nimbus catalog. See the pricing page for full rate details.
Context
—
Best for
Latest GPT-5.4 image, best prompt adherence in the OpenAI line
Choosing a model
- Highest quality, budget flexible: Claude Opus 4.8 or GPT-5.5.
- Best value coding: GPT-5.3 Codex for hosted, DeepSeek V4 Pro or Qwen3 Coder for bulk.
- Cheapest 1M context: Gemini 3 Flash Preview or DeepSeek V4 Pro.
- Real-time chat: Claude Haiku 4.5 or GPT-5.4 mini.
- Vision: Claude Opus / Sonnet / Haiku 4.x, GPT-5.5, Gemini 3.1 Pro, Kimi K2.6.