Live TTFT: 228 ms

Cogito, ergo ship.

Open-source LLMs and your fine-tunes — served on AWS Trainium and NVIDIA GPUs at frontier speed. OpenAI-compatible from line one.

Drop-in OpenAI compatible · No cold starts · Pay per token, not idle
stream.py
import os

from openai import OpenAI

# Point the standard OpenAI client at the Cogito endpoint.
client = OpenAI(
    base_url="https://api.cogito.decart.ai/v1",
    api_key=os.environ["COGITO_API_KEY"],
)

stream = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
    stream=True,
)

# Print each piece of the reply as it arrives.
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")

The first six teams to ship on Cogito

Reserved for early-access partners. Your logo could be one of them.

01 · open
02 · open
03 · open
04 · open
05 · open
06 · open

Fast inference. No asterisks.

Built like infrastructure should be.

Three things we obsess over so your application doesn't have to.

Trainium + GPU, routed for you

We pick the right silicon for each model: AWS Trainium for cost-efficient throughput, NVIDIA H100 / H200 for the heaviest configurations. You just call the API.

Drop-in OpenAI compatible

Swap base_url and api_key. Streaming SSE, function calling, structured JSON outputs — all match the OpenAI spec. No client rewrite, no surprises.
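To make the function-calling claim concrete, here is a minimal sketch of a request in the standard OpenAI `tools` format. The `get_weather` tool, its schema, and the helper names are hypothetical and exist only for illustration; the base URL and model name are taken from the stream.py sample on this page. Whether a particular model accepts tool calls is an assumption to verify against your account.

```python
import json
import os

# Hypothetical tool definition in the standard OpenAI "tools" format.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",  # illustrative name, not a Cogito built-in
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def build_request(question: str) -> dict:
    """Build keyword arguments for client.chat.completions.create()."""
    return {
        "model": "gpt-oss-120b",
        "messages": [{"role": "user", "content": question}],
        "tools": [WEATHER_TOOL],
    }


def parse_tool_arguments(raw_arguments: str) -> dict:
    """Tool-call arguments arrive as a JSON string; decode them to a dict."""
    return json.loads(raw_arguments)


def main() -> None:
    """Send the request. Needs the openai package and COGITO_API_KEY."""
    # Imported lazily so the helpers above work without the openai package.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.cogito.decart.ai/v1",
        api_key=os.environ["COGITO_API_KEY"],
    )
    response = client.chat.completions.create(**build_request("Weather in Oslo?"))
    # tool_calls is None when the model answers in plain text instead.
    for call in response.choices[0].message.tool_calls or []:
        print(call.function.name, parse_tool_arguments(call.function.arguments))
```

Call `main()` to send the request. The payload is the same one you would send to OpenAI, so only the client construction differs.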

No asterisks

Token-aware rate limits, deterministic error schemas with request IDs, zero retention by default, hard spend caps. The infrastructure niceties incumbents don't bother with.
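On the client side, rate limits are usually handled with a retry loop. The sketch below is one possible shape, not Cogito's documented behavior: the `RateLimited` exception, its `retry_after` and `request_id` fields, and the `call_with_backoff` helper are all hypothetical names, and it assumes your own code converts an HTTP 429 response into that exception.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error your code raises when the API returns HTTP 429."""

    def __init__(self, retry_after=None, request_id=None):
        super().__init__("rate limited")
        self.retry_after = retry_after  # seconds to wait, if the server said so
        self.request_id = request_id    # keep this for support tickets and logs


def call_with_backoff(send, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send() and retry on RateLimited, waiting longer each time."""
    for attempt in range(max_attempts):
        try:
            return send()
        except RateLimited as err:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            if err.retry_after is not None:
                delay = err.retry_after  # prefer the server's own hint
            else:
                # Exponential backoff plus jitter to avoid synchronized retries.
                delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

Passing `sleep` as a parameter keeps the helper easy to test, and logging `request_id` on each failure gives support a handle for the exact request that was throttled.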

Enterprise

Inference your security team can sign off on.

SOC 2 Type II certified through Decart AI, GDPR-compliant by design, single-tenant VPC deployments, contractual P99 SLAs, zero retention by default. Open-source models without the open-source compliance gap.

  • SOC 2 Type II
    Certified
  • GDPR
    Compliant
  • HIPAA BAA
    In progress

Ship something today.

$5 in free credits. No credit card. Five minutes from sign-up to your first streamed response.