Cost caps

Set a hard USD ceiling per API key. When the cap is hit, the gateway returns 402 and blocks every subsequent request until the cap resets or you raise it.

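Every Nimbus API key can have an independent daily cap, monthly cap, or both. Caps are enforced at the edge before the request touches an upstream vendor, so once a key has reached its cap, no further upstream calls are made for it. The request that crosses the cap can still push spend slightly past it (the sample 402 response on this page shows $25.03 spent against a $25.00 cap). The account balance is separate: you can hold $10,000 in the account and still restrict a debug key to $5/day.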

Why caps matter

  • Runaway loops. A misconfigured agent that retries a failed tool call forever can drain a $500 balance in under twenty minutes on Claude Opus. A $50/day cap turns that into a $50 mistake.
  • Leaked keys. A key committed to a public repo gets found and abused within minutes. A cap turns credential theft into a bounded loss.
  • Multi-tenant isolation. If you resell Nimbus to your own customers, cap each customer's key at their monthly plan value.
  • Environment separation. Cap the production key at $10k/mo and the staging key at $50/mo so a bad deploy can't drain the prod runway.

Set a cap from the dashboard

  1. Open /dashboard/keys. Every key has a Cap column that shows its current daily and monthly limits.
  2. Click Edit on the key row. A side panel opens with three inputs: Daily cap (USD), Monthly cap (USD), and a hard-cap toggle.
  3. Fill in a number or leave blank for no cap. Blank means unlimited (still bounded by account balance). Click Save.
  4. Add up to three alert thresholds (defaults: 50%, 80%, 95% of the cap). Each threshold fires an email to the account owner and posts to any Discord webhook you register in Settings.
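
To make the default thresholds concrete, the sketch below converts them into dollar amounts for a given cap. The function name `threshold_amounts` is hypothetical and this is only arithmetic on the documented defaults (50%, 80%, 95%), not a description of how the gateway computes alerts.

```python
from decimal import Decimal

def threshold_amounts(cap_usd, thresholds=("0.5", "0.8", "0.95")):
    """Return the spend level (USD) at which each alert threshold is crossed.

    `thresholds` defaults to the documented 50% / 80% / 95% values.
    Decimal is used so that cent amounts are not distorted by float rounding.
    """
    cap = Decimal(str(cap_usd))
    return [(Decimal(str(t)) * cap).quantize(Decimal("0.01")) for t in thresholds]

# A $25.00 daily cap alerts at $12.50, $20.00 and $23.75.
print(threshold_amounts("25.00"))
```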

Set a cap from the API

Use the account token (from /dashboard/settings) — API keys themselves cannot raise their own cap.

bash
curl -sS https://llm.nimbusapi.net/v1/keys/KEY_ID/cap \
  -H "Authorization: Bearer $NIMBUS_ACCOUNT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "daily_cap_usd": 25.00,
    "monthly_cap_usd": 500.00,
    "hard_cap": true
  }'
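
The same call can be sketched in Python with only the standard library. This mirrors the curl command above: URL, headers, and body fields come from that example, and the POST method is what curl uses by default with `-d`. The helper name `build_set_cap_request` is hypothetical, and the request is only built here, not sent.

```python
import json
import urllib.request

BASE_URL = "https://llm.nimbusapi.net/v1"

def build_set_cap_request(key_id, account_token, daily_cap_usd, monthly_cap_usd, hard_cap=True):
    """Build (but do not send) the request that sets a key's caps.

    Uses the account token, not an API key, as in the curl example.
    """
    body = {
        "daily_cap_usd": daily_cap_usd,
        "monthly_cap_usd": monthly_cap_usd,
        "hard_cap": hard_cap,
    }
    return urllib.request.Request(
        f"{BASE_URL}/keys/{key_id}/cap",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {account_token}",
            "Content-Type": "application/json",
        },
        method="POST",  # assumption: matches curl's default when -d is given
    )

req = build_set_cap_request("KEY_ID", "example-token", 25.00, 500.00)
# To send it: urllib.request.urlopen(req)
```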

Read cap status

bash
curl -sS https://llm.nimbusapi.net/v1/keys/KEY_ID/cap \
  -H "Authorization: Bearer $NIMBUS_ACCOUNT_TOKEN"

# {
#   "key_id": "nk_live_ab12...",
#   "daily_cap_usd": 25.00,
#   "daily_spent_usd": 8.42,
#   "monthly_cap_usd": 500.00,
#   "monthly_spent_usd": 214.06,
#   "hard_cap": true,
#   "alert_thresholds": [0.5, 0.8, 0.95]
# }
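
A status response like the one above can be turned into remaining budget and a list of thresholds already crossed. The sketch below is a minimal client-side helper; the name `summarize_cap_status` is hypothetical, and it assumes that a key without a cap reports the corresponding field as absent or null, which this page does not specify.

```python
def summarize_cap_status(status):
    """Summarize remaining budget and crossed alert thresholds per period.

    `status` is the parsed JSON from the cap status endpoint.
    Periods without a cap (field missing or None, an assumption) are skipped.
    """
    summary = {}
    thresholds = status.get("alert_thresholds", [])
    for period in ("daily", "monthly"):
        cap = status.get(f"{period}_cap_usd")
        if cap is None:
            continue
        spent = status.get(f"{period}_spent_usd", 0.0)
        used = spent / cap if cap else 1.0  # treat a zero cap as fully used
        summary[period] = {
            "remaining_usd": round(cap - spent, 2),
            "crossed": [t for t in thresholds if used >= t],
        }
    return summary

status = {
    "daily_cap_usd": 25.00, "daily_spent_usd": 8.42,
    "monthly_cap_usd": 500.00, "monthly_spent_usd": 214.06,
    "hard_cap": True, "alert_thresholds": [0.5, 0.8, 0.95],
}
print(summarize_cap_status(status))
```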

What happens when the cap hits

The next request returns HTTP 402 Payment Required with error type insufficient_balance and code cap_exceeded. The response body identifies which cap was hit, how much was spent, and when the cap resets. Daily caps reset at 00:00 UTC. Monthly caps reset at 00:00 UTC on the 1st.

json
{
  "error": {
    "type": "insufficient_balance",
    "code": "cap_exceeded",
    "message": "API key daily cap of $25.00 exceeded. Spent $25.03 today. Reset at 00:00 UTC.",
    "cap_type": "daily",
    "cap_usd": 25.00,
    "spent_usd": 25.03,
    "reset_at": "2026-07-02T00:00:00Z"
  }
}
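
A client can use `reset_at` to decide how long to pause instead of retrying into repeated 402s. The sketch below parses the error body shown above and returns the seconds remaining until the reset. The function name `seconds_until_reset` is hypothetical, and matching on `code == "cap_exceeded"` assumes that other 402 causes use a different code, which this page does not confirm.

```python
import json
from datetime import datetime, timezone

def seconds_until_reset(error_body, now=None):
    """Return seconds until the cap resets, or None if this is not a cap error.

    `error_body` is the raw JSON text of a 402 response.
    `now` can be injected for testing; it defaults to the current UTC time.
    """
    err = json.loads(error_body).get("error", {})
    if err.get("code") != "cap_exceeded":
        return None
    # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
    reset_at = datetime.fromisoformat(err["reset_at"].replace("Z", "+00:00"))
    now = now or datetime.now(timezone.utc)
    return max(0.0, (reset_at - now).total_seconds())

body = '{"error": {"type": "insufficient_balance", "code": "cap_exceeded", "cap_type": "daily", "reset_at": "2026-07-02T00:00:00Z"}}'
one_hour_before = datetime(2026, 7, 1, 23, 0, tzinfo=timezone.utc)
print(seconds_until_reset(body, now=one_hour_before))
```

Pausing until the reset only makes sense for background work; for interactive traffic, raising the cap or switching to a soft cap is usually the more practical response.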

Hard cap vs soft cap

  • Hard cap (default). The gateway rejects the request with a 402. No upstream call is made and no tokens are billed. Recommended for production keys.
  • Soft cap. Requests are still served past the cap, but the alert email fires every 5 minutes until you raise the cap or the period resets. Useful when uptime matters more than budget precision (e.g. a customer-facing product where a 402 is worse than a small overrun).

Alerts

Alerts fire on the way up, not on reset. When usage crosses each threshold (50%, 80%, 95% by default), we send one email to the account owner and one Discord webhook call. The payload:

json
{
  "event": "cap_threshold_crossed",
  "key_id": "nk_live_ab12...",
  "key_name": "prod-web",
  "cap_type": "daily",
  "cap_usd": 25.00,
  "spent_usd": 20.13,
  "threshold": 0.8,
  "timestamp": "2026-07-01T14:32:18Z"
}

Register a Discord webhook at /dashboard/settings. The same webhook receives balance-low alerts and invoice-ready notifications.
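
If you relay this payload through your own service, for example to log it or forward it to another channel, a one-line summary is often enough. The sketch below formats the fields shown in the payload above; the function name `format_alert` is hypothetical and the relay setup itself is an assumption, not something this page describes.

```python
def format_alert(payload):
    """Render a cap_threshold_crossed payload as a short human-readable line."""
    pct = round(payload["threshold"] * 100)
    return (
        f'{payload["key_name"]}: {payload["cap_type"]} cap {pct}% used '
        f'(${payload["spent_usd"]:.2f} of ${payload["cap_usd"]:.2f})'
    )

payload = {
    "event": "cap_threshold_crossed",
    "key_name": "prod-web",
    "cap_type": "daily",
    "cap_usd": 25.00,
    "spent_usd": 20.13,
    "threshold": 0.8,
}
print(format_alert(payload))
```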

Interaction with account balance

The account balance acts as a global ceiling across all keys, while a per-key cap limits one key only. Whichever limit is reached first blocks the request. In practice: if the account balance is $200 and a key has a $500 daily cap, the balance runs out first. If the account balance is $10,000 and the same key has a $50 daily cap, the key hits its cap first. See Top up for how the account balance ledger works.
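
The two cases above reduce to comparing the account balance with the key's remaining headroom. The sketch below is a client-side model of that rule for planning purposes; the function name `first_limit` is hypothetical, and how the gateway breaks an exact tie is an assumption.

```python
def first_limit(balance_usd, cap_usd, spent_usd=0.0):
    """Return which limit blocks first and how much can still be spent.

    Compares the account balance with the key's remaining cap headroom.
    On an exact tie this reports the key cap (an assumption).
    """
    cap_headroom = max(0.0, cap_usd - spent_usd)
    if balance_usd < cap_headroom:
        return ("account_balance", balance_usd)
    return ("key_cap", cap_headroom)

print(first_limit(200, 500))     # balance runs out before the $500 cap
print(first_limit(10_000, 50))   # the $50 cap is hit first
```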