Model vs model, real cost

LLM API cost comparisons

The per-token rate is the easy part. What actually decides your bill is the input/output mix of your workload. Each comparison below runs both models across the same four real jobs, and it stays current because every page reads live from our pricing data rather than from a blog post that went stale six months ago.

Comparisons: 10
Workloads each: 4
Pricing: Live

Every head-to-head

How we compare

Every comparison prices the same four workloads, so the input-heavy vs output-heavy trade-off between two models is visible at a glance. A model that's cheapest on short chats can be the expensive one on long-document summaries. The mix is the whole game.

  • Short chat turn
    1K tokens in / 500 out: a typical assistant reply

  • RAG answer
    8K tokens in / 800 out: retrieved context + grounded answer

  • Long-doc summary
    50K tokens in / 2K out: summarize a long document

  • Bulk classification
    2K tokens in / 50 out: label/route at high volume
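
To make the arithmetic concrete, here is a minimal Python sketch of how a per-request cost falls out of the four workloads above. The token counts come from the list; everything else is an assumption for illustration: the two models ("model-a", "model-b"), their rates, and the helper names `request_cost` and `cheapest` are hypothetical and are not taken from our pricing data. Rates are assumed to be quoted in USD per 1M tokens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str
    input_tokens: int
    output_tokens: int


@dataclass(frozen=True)
class Pricing:
    """Rates in USD per 1M tokens (assumed unit)."""
    input_per_mtok: float
    output_per_mtok: float


# The four workloads from the list above.
WORKLOADS = [
    Workload("Short chat turn", 1_000, 500),
    Workload("RAG answer", 8_000, 800),
    Workload("Long-doc summary", 50_000, 2_000),
    Workload("Bulk classification", 2_000, 50),
]

# Hypothetical rates, chosen only to show the trade-off. Not real prices.
MODELS = {
    "model-a": Pricing(input_per_mtok=1.00, output_per_mtok=2.00),  # pricier input, cheaper output
    "model-b": Pricing(input_per_mtok=0.50, output_per_mtok=4.00),  # cheaper input, pricier output
}


def request_cost(workload: Workload, pricing: Pricing) -> float:
    """Cost of one request in USD: tokens times rate, summed over input and output."""
    total = (
        workload.input_tokens * pricing.input_per_mtok
        + workload.output_tokens * pricing.output_per_mtok
    )
    return total / 1_000_000


def cheapest(workload: Workload, models: dict[str, Pricing]) -> str:
    """Name of the model with the lowest cost for this workload."""
    return min(models, key=lambda name: request_cost(workload, models[name]))


for w in WORKLOADS:
    costs = ", ".join(
        f"{name}: ${request_cost(w, p):.4f}" for name, p in MODELS.items()
    )
    print(f"{w.name:<20} {costs}  (cheapest: {cheapest(w, MODELS)})")
```

With these made-up rates, model-a is cheaper on the short chat turn, where output is a third of the tokens, while model-b is cheaper on the three input-heavy workloads. Neither model is cheaper across the board, which is why each comparison prices all four.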
