DeepSeek V3.2
The proven workhorse. DeepSeek V3.2's reasoning is more than enough for 90% of production workloads, at one of the lowest prices in the catalog.
671B-parameter MoE (~37B active) · DeepSeek License
Context
131,072 tokens (128K)
Tokens / sec
70
TTFT
260ms
Hardware
AWS Trainium
Pricing
- Input
- $0.25 / 1M tokens
- Output
- $0.38 / 1M tokens
- Context cache
- 50% of input rate, automatic
- Fine-tunes
- Same per-token price as base
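The rates above fold into a quick back-of-the-envelope cost estimate. The helper below is our own illustration derived from the pricing table, not part of any SDK; the constants mirror the listed rates.

```python
# Rates from the pricing table above (USD per token); helper names are ours.
INPUT_RATE = 0.25 / 1_000_000    # $0.25 / 1M input tokens
OUTPUT_RATE = 0.38 / 1_000_000   # $0.38 / 1M output tokens
CACHE_DISCOUNT = 0.5             # cached input tokens bill at 50% of the input rate

def estimate_cost(input_tokens, output_tokens, cached_tokens=0):
    """Estimate one request's cost in dollars from the published rates."""
    fresh = input_tokens - cached_tokens
    return (fresh * INPUT_RATE
            + cached_tokens * INPUT_RATE * CACHE_DISCOUNT
            + output_tokens * OUTPUT_RATE)

# Example: 10k input tokens (half served from cache) plus 2k output tokens.
print(f"${estimate_cost(10_000, 2_000, cached_tokens=5_000):.6f}")  # → $0.002635
```

Fine-tuned variants bill at the same per-token rates, so the same arithmetic applies.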
Capabilities
- SSE streaming
- Tool / function calling
- Structured JSON outputs
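Tool calling uses the OpenAI function-calling schema, so a tool is declared as a JSON Schema object and the model's arguments come back as a JSON string. A minimal sketch, assuming OpenAI-compatible semantics; the `get_weather` tool and its fields are made up for illustration:

```python
import json

# Illustrative tool definition in the OpenAI function-calling format;
# "get_weather" is a hypothetical tool, not part of this catalog.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# This list is passed as chat.completions.create(..., tools=tools).
# If the model decides to call the tool, the tool_call's arguments
# arrive as a JSON string that the caller parses:
args = json.loads('{"city": "Tokyo"}')
print(args["city"])  # → Tokyo
```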
Use cases
Production chat · Coding · Cost-sensitive workloads
Quickstart
deepseek-v3.2.py
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.cogito.decart.ai/v1",
    api_key=os.environ["COGITO_API_KEY"],
)

response = client.chat.completions.create(
    model="deepseek-v3.2",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)