DeepSeek V3.2
The proven workhorse. DeepSeek V3.2's reasoning is more than enough for 90% of production workloads, at one of the lowest prices in the catalog.
671B-parameter MoE (~37B active) · DeepSeek License
Context
131,072 tokens (128K)
Tokens / sec
70
TTFT
260ms
Hardware
AWS Trainium
Pricing
- Input
- $0.25 / 1M tokens
- Output
- $0.38 / 1M tokens
- Context cache
- 50% of input rate, automatic
- Fine-tunes
- Same per-token price as base
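The rates above fold into a quick back-of-the-envelope cost estimate. The helper below is our own illustration derived from the pricing table, not part of any SDK; the constants mirror the listed rates.

```python
# Rates from the pricing table above (USD per token); helper names are ours.
INPUT_RATE = 0.25 / 1_000_000    # $0.25 / 1M input tokens
OUTPUT_RATE = 0.38 / 1_000_000   # $0.38 / 1M output tokens
CACHE_DISCOUNT = 0.5             # cached input tokens bill at 50% of the input rate

def estimate_cost(input_tokens, output_tokens, cached_tokens=0):
    """Estimate one request's cost in dollars from the published rates."""
    fresh = input_tokens - cached_tokens
    return (fresh * INPUT_RATE
            + cached_tokens * INPUT_RATE * CACHE_DISCOUNT
            + output_tokens * OUTPUT_RATE)

# Example: 10k input tokens (half served from cache) plus 2k output tokens.
print(f"${estimate_cost(10_000, 2_000, cached_tokens=5_000):.6f}")  # → $0.002635
```

Fine-tuned variants bill at the same per-token rates, so the same arithmetic applies.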
Capabilities
- SSE streaming
- Tool / function calling
- Structured JSON outputs
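Tool calling uses the OpenAI function-calling schema, so a tool is declared as a JSON Schema object and the model's arguments come back as a JSON string. A minimal sketch, assuming OpenAI-compatible semantics; the `get_weather` tool and its fields are made up for illustration:

```python
import json

# Illustrative tool definition in the OpenAI function-calling format;
# "get_weather" is a hypothetical tool, not part of this catalog.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# This list is passed as chat.completions.create(..., tools=tools).
# If the model decides to call the tool, the tool_call's arguments
# arrive as a JSON string that the caller parses:
args = json.loads('{"city": "Tokyo"}')
print(args["city"])  # → Tokyo
```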
Use cases
Production chat · Coding · Cost-sensitive workloads
Quickstart
deepseek-v3.2.py
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.cogito.decart.ai/v1",
    api_key=os.environ["COGITO_API_KEY"],
)

response = client.chat.completions.create(
    model="deepseek-v3.2",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)