
OpenAI

GPT-OSS 120B

OpenAI's first open-weight model since GPT-2: a 120B-parameter mixture-of-experts model with strong general reasoning, priced at $0.04 per 1M input tokens and $0.18 per 1M output tokens.

120B (MoE, ~5B active)
Apache 2.0
Context

131,072 tokens

Tokens / sec

70

TTFT

240ms
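
The throughput and TTFT figures above can be combined into a rough response-time estimate. The sketch below assumes the listed 240 ms and 70 tokens/sec hold as steady averages for a single request, which the page does not guarantee; `estimate_latency_seconds` is a hypothetical helper, not part of any SDK.

```python
# Figures taken from this page; treating them as steady-state averages
# is an assumption, not a guarantee.
TTFT_SECONDS = 0.240
TOKENS_PER_SECOND = 70.0


def estimate_latency_seconds(output_tokens: int) -> float:
    """Rough wall-clock time to receive a full response of output_tokens tokens."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    # Time to first token plus decode time for the generated tokens.
    return TTFT_SECONDS + output_tokens / TOKENS_PER_SECOND


if __name__ == "__main__":
    print(f"{estimate_latency_seconds(700):.2f} s")
```

Under these assumptions a 700-token reply takes about 10 seconds, almost all of it decode time, so streaming matters more for perceived latency than TTFT does.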

Hardware

AWS Trainium

Pricing

Input
$0.04 / 1M tokens
Output
$0.18 / 1M tokens
Context cache
50% of input rate, automatic
Fine-tunes
Same per-token price as base
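
To make the rates above concrete, here is a minimal cost estimator. It assumes that "50% of input rate" means cached input tokens are billed at half the input price and that all other input tokens are billed at the full rate; how cache hits are counted is not specified on this page. `estimate_cost_usd` is a hypothetical helper.

```python
# Rates listed on this page, in USD per 1M tokens.
INPUT_USD_PER_M = 0.04
OUTPUT_USD_PER_M = 0.18
CACHE_DISCOUNT = 0.5  # assumption: cached input is billed at 50% of the input rate


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      cached_input_tokens: int = 0) -> float:
    """Estimate the cost of one request in USD."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_input_tokens
    # Cached tokens count at the discounted rate, the rest at the full rate.
    billable_input = uncached + cached_input_tokens * CACHE_DISCOUNT
    input_cost = billable_input * INPUT_USD_PER_M / 1_000_000
    output_cost = output_tokens * OUTPUT_USD_PER_M / 1_000_000
    return input_cost + output_cost


if __name__ == "__main__":
    # 100k-token prompt, 80% served from cache, 2k-token reply.
    print(f"${estimate_cost_usd(100_000, 2_000, cached_input_tokens=80_000):.6f}")
```

Since fine-tunes are listed at the same per-token price, the same estimate would apply to them.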

Capabilities

  • SSE streaming
  • Tool / function calling
  • Structured JSON outputs
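
As a sketch of how these three capabilities might be requested, the helper below builds keyword arguments for `client.chat.completions.create`. The parameter names (`stream`, `tools`, `response_format`) are those of the OpenAI Chat Completions API; that this endpoint accepts them unchanged is an assumption. `build_request` and the `get_weather` tool are hypothetical names used only for illustration.

```python
from typing import Any, Optional

# Hypothetical tool definition in the OpenAI function-calling schema.
GET_WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def build_request(prompt: str, *, stream: bool = False,
                  tools: Optional[list] = None,
                  json_mode: bool = False) -> dict[str, Any]:
    """Build kwargs for client.chat.completions.create (no network call)."""
    kwargs: dict[str, Any] = {
        "model": "gpt-oss-120b",
        "messages": [{"role": "user", "content": prompt}],
    }
    if stream:
        kwargs["stream"] = True  # SSE streaming
    if tools:
        kwargs["tools"] = tools  # tool / function calling
    if json_mode:
        # Assumption: JSON mode is selected the same way as on OpenAI's API.
        kwargs["response_format"] = {"type": "json_object"}
    return kwargs


if __name__ == "__main__":
    print(build_request("Weather in Oslo?", tools=[GET_WEATHER_TOOL]))
```

With a client configured for this endpoint, the result would be passed as `client.chat.completions.create(**build_request(...))`.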

Use cases

  • General chat
  • Coding
  • Reasoning
  • Tool use

Quickstart

gpt-oss-120b.py
import os

from openai import OpenAI

# Point the OpenAI client at the Cogito endpoint; the API key is read from the environment.
client = OpenAI(
    base_url="https://api.cogito.decart.ai/v1",
    api_key=os.environ["COGITO_API_KEY"],
)

response = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)