Moonshot

Kimi K2.6

Moonshot's flagship mixture-of-experts (MoE) model, trained for long-horizon agentic workflows. It is the model engineering teams reach for when cheaper models stop being enough.

1T MoE (~32B active)
Modified MIT
Context

262,144 tokens

Tokens / sec

70

TTFT

280ms

Hardware

NVIDIA H100

Pricing

Input: $0.74 / 1M tokens
Output: $3.49 / 1M tokens
Context cache: 50% of input rate, automatic
Fine-tunes: Same per-token price as base
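
To make the rates above concrete, here is a minimal cost-estimation sketch. It assumes that cached tokens are a subset of the input tokens and are billed at half the input rate; the page does not spell out how the discount is applied, so treat that as an assumption. The function and constant names are illustrative only.

```python
# Rates taken from the pricing list above (USD per 1M tokens).
INPUT_USD_PER_M = 0.74
OUTPUT_USD_PER_M = 3.49
# Assumption: cached input tokens are billed at 50% of the input rate.
CACHE_DISCOUNT = 0.50


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      cached_input_tokens: int = 0) -> float:
    """Estimate the cost of one request in USD.

    cached_input_tokens is the portion of input_tokens served from the
    context cache; it must not exceed input_tokens.
    """
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")

    fresh_tokens = input_tokens - cached_input_tokens
    # Cached tokens count as a fraction of a fresh input token.
    billable_input = fresh_tokens + cached_input_tokens * CACHE_DISCOUNT
    input_cost = billable_input * INPUT_USD_PER_M / 1_000_000
    output_cost = output_tokens * OUTPUT_USD_PER_M / 1_000_000
    return input_cost + output_cost
```

Under these assumptions, a request with 200,000 input tokens (150,000 of them cached) and 10,000 output tokens comes to roughly $0.13.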

Capabilities

  • SSE streaming
  • Tool / function calling
  • Structured JSON outputs
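
As an illustration of tool / function calling, the sketch below runs one tool-calling round trip. It assumes the endpoint accepts the OpenAI-style `tools` parameter, which this page implies but does not document, and that `client` is an OpenAI client configured as in the quickstart. The `get_weather` tool and all helper names are hypothetical.

```python
import json


# Hypothetical local tool, for illustration only; not part of the API.
def get_weather(city: str) -> dict:
    return {"city": city, "forecast": "sunny"}


# OpenAI-style tool schema describing the hypothetical tool to the model.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the forecast for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

LOCAL_TOOLS = {"get_weather": get_weather}


def run_tool_call(name: str, arguments: str) -> str:
    """Execute one tool call requested by the model; return a JSON string."""
    result = LOCAL_TOOLS[name](**json.loads(arguments))
    return json.dumps(result)


def ask_with_tools(client, prompt: str) -> str:
    """Send a prompt, run any requested tools, and return the final reply."""
    messages = [{"role": "user", "content": prompt}]
    first = client.chat.completions.create(
        model="kimi-k2.6", messages=messages, tools=TOOLS
    )
    reply = first.choices[0].message
    if not reply.tool_calls:
        return reply.content  # The model answered without using a tool.

    # Echo the assistant turn, then append one "tool" message per call.
    messages.append(reply)
    for call in reply.tool_calls:
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": run_tool_call(call.function.name, call.function.arguments),
        })
    final = client.chat.completions.create(
        model="kimi-k2.6", messages=messages, tools=TOOLS
    )
    return final.choices[0].message.content
```

Keeping `run_tool_call` separate from the network call makes the dispatch logic easy to test offline before pointing it at the live model.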

Use cases

  • Coding agents
  • Long-horizon reasoning
  • Complex tool use

Quickstart

kimi-k2.6.py
import os

from openai import OpenAI

# Point the OpenAI SDK at the Cogito endpoint; the key is read from the environment.
client = OpenAI(
    base_url="https://api.cogito.decart.ai/v1",
    api_key=os.environ["COGITO_API_KEY"],
)

response = client.chat.completions.create(
    model="kimi-k2.6",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
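
A streaming variant of the quickstart request might look like the sketch below. It assumes the endpoint follows the OpenAI SDK's `stream=True` chunk format, which the SSE streaming capability suggests but this page does not show. `stream_reply` is a hypothetical helper, and `client` is the OpenAI client from the quickstart.

```python
def stream_reply(client, prompt: str) -> str:
    """Print the reply as it streams in and return the full text."""
    stream = client.chat.completions.create(
        model="kimi-k2.6",
        messages=[{"role": "user", "content": prompt}],
        stream=True,  # Ask for incremental chunks instead of one response.
    )
    parts = []
    for chunk in stream:
        # Some chunks carry no choices or an empty delta; skip them.
        if not chunk.choices:
            continue
        text = chunk.choices[0].delta.content
        if text:
            print(text, end="", flush=True)
            parts.append(text)
    print()
    return "".join(parts)
```

Calling `stream_reply(client, "Hello!")` prints text as chunks arrive, which is where the time-to-first-token figure listed above matters more than total latency.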