Moonshot

Kimi K2.6

Moonshot's flagship MoE, trained for long-horizon agentic workflows. It is the model that engineering teams reach for when cheaper models stop being enough.

1T MoE (~32B active)
Modified MIT

Context

262,144 tokens

Tokens / sec

TTFT

280ms

Hardware

NVIDIA H100

Pricing

Input: $0.74 / 1M tokens
Output: $3.49 / 1M tokens
Context cache: 50% of input rate, automatic
Fine-tunes: Same per-token price as base
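As a rough illustration of how the rates above combine, the sketch below estimates the cost of a single request. The function name `estimate_cost_usd` and its parameters are hypothetical, and it assumes that cached tokens are a subset of the input tokens billed at half the input rate; how the provider actually counts cached tokens is not stated here.

```python
# Rates copied from the pricing list above (USD per 1M tokens).
INPUT_USD_PER_M = 0.74
OUTPUT_USD_PER_M = 3.49
CACHE_DISCOUNT = 0.5  # cached input is billed at 50% of the input rate


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      cached_input_tokens: int = 0) -> float:
    """Estimate the cost of one request in USD.

    Assumption: cached_input_tokens is the part of input_tokens that was
    served from the context cache; the rest is billed at the full rate.
    """
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_input_tokens
    input_cost = uncached * INPUT_USD_PER_M / 1_000_000
    cached_cost = cached_input_tokens * INPUT_USD_PER_M * CACHE_DISCOUNT / 1_000_000
    output_cost = output_tokens * OUTPUT_USD_PER_M / 1_000_000
    return input_cost + cached_cost + output_cost
```

For example, a request with 100,000 input tokens (80,000 of them cached) and 2,000 output tokens comes to about $0.051 under these assumptions.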

Capabilities

SSE streaming
Tool / function calling
Structured JSON outputs
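A minimal sketch of consuming the SSE stream, assuming the endpoint follows the OpenAI chat-completions streaming format (the quickstart's use of the `openai` client suggests this, but it is not confirmed here). The helpers `make_client` and `stream_reply` are hypothetical names; the base URL, environment variable, and model name are taken from the quickstart.

```python
import os


def make_client():
    """Build a client the same way the quickstart does (needs the `openai` package)."""
    from openai import OpenAI  # imported lazily so stream_reply has no hard dependency

    return OpenAI(
        base_url="https://api.cogito.decart.ai/v1",
        api_key=os.environ["COGITO_API_KEY"],
    )


def stream_reply(client, prompt: str, model: str = "kimi-k2.6") -> str:
    """Print text as it arrives and return the full reply."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,  # request incremental chunks instead of one final response
    )
    parts = []
    for chunk in stream:
        if not chunk.choices:
            continue  # some chunks may carry no choices (e.g. usage-only chunks)
        delta = chunk.choices[0].delta.content
        if delta:  # content can be None on role-only or final chunks
            print(delta, end="", flush=True)
            parts.append(delta)
    print()
    return "".join(parts)
```

Calling `stream_reply(make_client(), "Hello!")` would print the reply incrementally. Passing the client in as an argument keeps the helper easy to exercise with a stub object.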

Use cases

Coding agents
Long-horizon reasoning
Complex tool use

Quickstart

kimi-k2.6.py

import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.cogito.decart.ai/v1",
    api_key=os.environ["COGITO_API_KEY"],
)

response = client.chat.completions.create(
    model="kimi-k2.6",
    messages=[{"role": "user", "content": "Hello!"}],
)

print(response.choices[0].message.content)