Pricing

Per token. No idle compute. Caps you can trust.

You pay for the tokens you use, not for GPUs sitting idle. Cached input tokens are billed at 50% off automatically. Fine-tuned variants are priced the same as their base model.

Free

$0

no card required

Test every model. Free credits to evaluate.

  • $5 in starter credits
  • All public models
  • Up to 60 RPM, 60k TPM
  • Community support
Start free

Pro

most popular

Pay as you go

per-token billing

Production traffic, transparent telemetry, hard spend caps.

  • All models, all features
  • Up to 600 RPM, 600k TPM
  • Hard spend caps & alerts
  • P99 latency dashboard
  • 50% off cached input tokens
  • Email support
Start building

Enterprise

Custom

annual contract

VPC isolation, single-tenant clusters, contractual SLAs.

  • Everything in Pro
  • Single-tenant on Trainium / GPU
  • P99 latency SLA
  • SOC 2 Type II (via Decart AI)
  • GDPR + EU data residency
  • HIPAA BAA on roadmap
  • VPC peering, zero egress
  • Dedicated solutions engineer
Talk to engineering

Calculator

Math we don't hide.

Plug in monthly volume to see what your bill actually looks like.

Model
Input tokens: 2.0M / month
Output tokens: 1.0M / month
Cached input share: 0% of input tokens repeat (50% discount)
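
The arithmetic behind the calculator can be sketched in a few lines of Python. This is a minimal illustration, not the page's actual implementation: the `monthly_cost` helper and the `PRICES` dictionary are hypothetical names, the prices are copied from the price table on this page, and the sketch assumes the cached share applies only to input tokens and that no other fees or minimums apply.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the price table on this page:
# (input, output, cached input). PRICES is a hypothetical name.
PRICES = {
    "GPT-OSS 120B": (Decimal("0.04"), Decimal("0.18"), Decimal("0.02")),
    "DeepSeek V4 Pro": (Decimal("0.43"), Decimal("0.87"), Decimal("0.22")),
    "DeepSeek V4 Flash": (Decimal("0.14"), Decimal("0.28"), Decimal("0.07")),
    "Kimi K2.6": (Decimal("0.74"), Decimal("3.49"), Decimal("0.37")),
    "DeepSeek V3.2": (Decimal("0.25"), Decimal("0.38"), Decimal("0.13")),
}

MILLION = Decimal(1_000_000)


def monthly_cost(model, input_tokens, output_tokens, cached_share=0):
    """Estimate a monthly bill in USD (hypothetical helper, not an official API).

    cached_share is the fraction of input tokens (0 to 1) billed at the
    cached-input rate; the rest are billed at the regular input rate.
    """
    if not 0 <= cached_share <= 1:
        raise ValueError("cached_share must be between 0 and 1")
    input_price, output_price, cached_price = PRICES[model]
    cached_tokens = Decimal(input_tokens) * Decimal(str(cached_share))
    fresh_tokens = Decimal(input_tokens) - cached_tokens
    total = (
        fresh_tokens * input_price
        + cached_tokens * cached_price
        + Decimal(output_tokens) * output_price
    )
    return total / MILLION


# Calculator defaults (2.0M input, 1.0M output, 0% cached) for one model.
cost = monthly_cost("Kimi K2.6", 2_000_000, 1_000_000)
print(f"${cost:.2f}")  # prints $4.97
```

Raising `cached_share` moves input tokens from the regular rate to the cached rate, so with half of the input cached the same volume on DeepSeek V4 Flash comes to 1.0M x $0.14 + 1.0M x $0.07 + 1.0M x $0.28 = $0.49 under these assumptions.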

All models, all prices

Updated quarterly. Past pricing is posted in the changelog.

Model               Provider    $/1M input   $/1M output   $/1M cached input
GPT-OSS 120B        OpenAI      $0.04        $0.18         $0.02
DeepSeek V4 Pro     DeepSeek    $0.43        $0.87         $0.22
DeepSeek V4 Flash   DeepSeek    $0.14        $0.28         $0.07
Kimi K2.6           Moonshot    $0.74        $3.49         $0.37
DeepSeek V3.2       DeepSeek    $0.25        $0.38         $0.13