Qwen Models API Cost Calculator & Comparison

Every Qwen model, side by side — current API rates, context window, benchmarks, and a live calculator that ranks them at your exact workload. 45 active models, 45 with public pricing. Prices refreshed daily.

Models tracked: 45 · Active: 45 · With public pricing: 45 · Cheapest input: $0.00/1M

Calculate your Qwen API cost at your workload.

Set your workload — every priced model ranks in real time.


Ranked by your monthly bill


Pricing at a glance

Blended $/1M tokens across the lineup.

Blended price uses a 3-to-1 input/output ratio. Green bar = cheapest.
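The blended figure is a simple weighted average; the 3-to-1 input/output ratio is the page's stated assumption, and the Qwen3.6 Plus rates used in the example are taken from the pricing table further down.

```python
def blended_price(input_per_1m: float, output_per_1m: float,
                  input_weight: int = 3, output_weight: int = 1) -> float:
    """Weighted-average $/1M tokens, assuming a 3:1 input/output token mix."""
    total_weight = input_weight + output_weight
    return (input_weight * input_per_1m + output_weight * output_per_1m) / total_weight

# Qwen3.6 Plus: $0.325/1M input, $1.95/1M output
print(round(blended_price(0.325, 1.95), 3))  # → 0.731
```

A heavier input weighting favors models with cheap input tokens, which matches typical chat and RAG workloads where prompts dwarf completions.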

Quick picks

Best Qwen model for your use case.

As of April 2026, Qwen offers 45 active models via API, ranging from $0/1M to $1.04/1M input tokens. The most context-rich model handles up to 1M tokens. Models support vision, deep reasoning, and tool use. All prices are USD per 1 million tokens.

Quality vs price

Qwen benchmarks at a glance.

Each point is one model — X is blended $/1M tokens, Y is the average of available quality benchmarks. Larger bubbles mean larger context windows.

Per-model benchmark scores

Model | Avg | Scores
Qwen3.5 397B A17B | 89.2 | AIME 2025: 91.3 · MMLU: 88.6 · MMLU Pro: 87.8
Qwen3.5 Plus 2026-02-15 | 87.2 | AIME 2025: 91.3 · HumanEval: 79.3 · IFEval: 92.6 · LiveCodeBench: 83.6 · MMLU: 88.6 · MMLU Pro: 87.8
Qwen3.5-27B | 86.8 | GPQA Diamond: 85.5 · IFEval: 95 · LiveCodeBench: 80.7 · MMLU Pro: 86.1
Qwen3.5-122B-A10B | 83.5 | GPQA Diamond: 86.6 · IFEval: 93.4 · LiveCodeBench: 78.9 · MMLU Pro: 86.7 · SWE-Bench Verified: 72
Qwen3 Next 80B A3B Thinking | 81.1 | AIME 2025: 87.8 · GPQA Diamond: 77.2 · IFEval: 88.9 · LiveCodeBench: 68.7 · MMLU Pro: 82.7
Qwen3.5-35B-A3B | 81.0 | GPQA Diamond: 84.2 · IFEval: 91.9 · LiveCodeBench: 74.6 · MMLU Pro: 85.3 · SWE-Bench Verified: 69.2
Qwen3.5-9B | 80.3 | GPQA Diamond: 81.7 · IFEval: 91.5 · LiveCodeBench: 65.6 · MMLU Pro: 82.5
Qwen3 30B A3B Thinking 2507 | 78.8 | AIME 2025: 85 · GPQA Diamond: 73.4 · IFEval: 88.9 · LiveCodeBench: 66 · MMLU Pro: 80.9
Qwen3.6 Plus | 78.8 | SWE-Bench Verified: 78.8
Qwen2.5 72B Instruct | 73.9 | BBH: 61.9 · HumanEval: 86.6 · IFEval: 86.4 · LiveCodeBench: 55.5 · MATH: 83.1 · MMLU: 86 · MMLU Pro: 58.1
Qwen3 Next 80B A3B Instruct | 73.6 | AIME 2025: 69.5 · IFEval: 87.6 · LiveCodeBench: 56.6 · MMLU Pro: 80.6
Qwen3 Max | 73.2 | AIME 2025: 81.6 · GPQA Diamond: 72.6 · LiveCodeBench: 69 · SWE-Bench Verified: 69.6
Qwen3 Max Thinking | 72.6 | GPQA Diamond: 72.6
Qwen3 Coder 480B A35B | 69.6 | SWE-Bench Verified: 69.6
Qwen3 30B A3B Instruct 2507 | 67.6 | AIME 2025: 61.3 · GPQA Diamond: 70.4 · IFEval: 84.7 · LiveCodeBench: 43.2 · MMLU Pro: 78.4
Qwen3 30B A3B | 66.8 | AIME 2024: 80.4 · GPQA Diamond: 43.9 · MMLU: 81.4 · MMLU Pro: 61.5
Qwen3 32B | 64.0 | AIME 2024: 81.4 · AIME 2025: 72.9 · MMLU: 83.3 · MMLU Pro: 65.5 · SciPredict: 17.0
Qwen2.5 Coder 32B Instruct | 62.1 | BBH: 52.3 · HumanEval: 92.7 · IFEval: 72.7 · LiveCodeBench: 55 · MMLU Pro: 37.9
Qwen3 14B | 60.6 | GPQA Diamond: 39.9 · MMLU: 81 · MMLU Pro: 61
Qwen2.5 7B Instruct | 54.9 | BBH: 34.9 · HumanEval: 57.9 · IFEval: 75.8 · MATH: 49.8 · MMLU: 74.2 · MMLU Pro: 36.5
QwQ 32B | 46.4 | AIME 2024: 79.5 · BBH: 2.9 · IFEval: 83.9 · LiveCodeBench: 63.4 · MMLU Pro: 2.2
Qwen3 235B A22B Instruct 2507 | 46.1 | AIME 2025: 92.3 · FrontierMath Tier-4: 0.0%
Qwen3 235B A22B Thinking 2507 | 46.1 | AIME 2025: 92.3 · FrontierMath Tier-4: 0.0%
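The Avg column appears to be an unweighted mean of whatever benchmarks each model reports — a minimal sketch, with the scores re-keyed from the Qwen3.5 397B A17B row:

```python
def avg_score(scores: dict[str, float]) -> float:
    """Unweighted mean of a model's reported benchmark scores, rounded to 1 dp."""
    return round(sum(scores.values()) / len(scores), 1)

qwen35_397b = {"AIME 2025": 91.3, "MMLU": 88.6, "MMLU Pro": 87.8}
print(avg_score(qwen35_397b))  # → 89.2
```

Because models report different benchmark sets, these averages are not directly comparable across rows — a model with one strong score (like Qwen3.6 Plus) can rank above one averaged over six harder benchmarks.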

Open weights

Open Models from Qwen

Qwen ships 32 open-source or open-weights models you can self-host or fine-tune. Each links to its Hugging Face card.

Every model

Every Qwen model — pricing, context & capabilities.

Model | Context | Input /1M | Output /1M
Qwen3.6 Plus | 1M | $0.325 | $1.95
Qwen3 Coder Next | 262K | $0.15 | $0.80
Qwen3.5-122B-A10B | 262K | $0.26 | $2.08
Qwen3.5-27B | 262K | $0.195 | $1.56
Qwen3.5-35B-A3B | 262K | $0.163 | $1.30
Qwen3.5-9B | 262K | $0.10 | $0.15
Qwen3.5-Flash | 1M | $0.065 | $0.26
Qwen3.5 397B A17B | 262K | $0.39 | $2.34
Qwen3.5 Plus 2026-02-15 | 1M | $0.26 | $1.56
Qwen-Plus | 1M | $0.26 | $0.78
Qwen3 VL 30B A3B Instruct | 131K | $0.13 | $0.52
Qwen3 VL 30B A3B Thinking | 131K | $0.13 | $1.56
Qwen3 VL 32B Instruct | 131K | $0.104 | $0.416
Qwen3 VL 8B Instruct | 131K | $0.08 | $0.50
Qwen3 VL 8B Thinking | 131K | $0.117 | $1.36
Qwen3 VL 235B A22B Instruct | 262K | $0.20 | $0.88
Qwen3 VL 235B A22B Thinking | 131K | $0.26 | $2.60
Qwen3 Max | 262K | $0.78 | $3.90
Qwen3 Max Thinking | 262K | $0.78 | $3.90
Qwen3 Next 80B A3B Instruct | 262K | $0.09 | $1.10
Qwen3 Next 80B A3B Instruct | 262K | $0.00 | $0.00
Qwen3 Next 80B A3B Thinking | 131K | $0.098 | $0.78
Qwen3 Coder Flash | 1M | $0.195 | $0.975
Qwen3 Coder 30B A3B Instruct | 160K | $0.07 | $0.27
Qwen Plus 0728 (thinking) | 1M | $0.26 | $0.78
Qwen3 235B A22B Instruct 2507 | 262K | $0.071 | $0.10
Qwen3 235B A22B Thinking 2507 | 262K | $0.13 | $0.60
Qwen3 30B A3B Instruct 2507 | 262K | $0.09 | $0.30
Qwen3 30B A3B Thinking 2507 | 131K | $0.08 | $0.40
Qwen3 Coder 480B A35B | 262K | $0.22 | $1.00
Qwen3 Coder 480B A35B | 262K | $0.00 | $0.00
Qwen3 Coder Plus | 1M | $0.65 | $3.25
Qwen-Turbo | 131K | $0.033 | $0.13
Qwen3 14B | 41K | $0.06 | $0.24
Qwen3 30B A3B | 41K | $0.08 | $0.28
Qwen3 32B | 41K | $0.08 | $0.24
Qwen3 8B | 41K | $0.05 | $0.40
QwQ 32B | 131K | $0.15 | $0.58
Qwen2.5 VL 72B Instruct | 32K | $0.25 | $0.75
Qwen-Max | 33K | $1.04 | $4.16
Qwen2.5 Coder 32B Instruct | 33K | $0.66 | $1.00
Qwen2.5 7B Instruct | 33K | $0.04 | $0.10
Qwen2.5 72B Instruct | 33K | $0.12 | $0.39
Qwen VL Max | 131K | $0.52 | $2.08
Qwen VL Plus | 131K | $0.137 | $0.409

FAQ

Frequently asked questions

Pricing patterns, best-known use cases, and how this provider stacks up.


Qwen API pricing ranges from $0 to $1.04 per 1M input tokens. Output tokens cost more than input on every model. Prices are per 1 million tokens (1M ≈ 750,000 words). Use the calculator above to estimate your monthly spend at your actual workload.
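The calculator's arithmetic is easy to reproduce offline. In this sketch, the Qwen-Turbo rates come from the pricing table above, while the workload numbers (requests and tokens per request) are hypothetical placeholders:

```python
def monthly_cost(requests_per_month: int, input_tokens: int, output_tokens: int,
                 input_per_1m: float, output_per_1m: float) -> float:
    """Estimated monthly spend in USD; prices are per 1M tokens."""
    per_request = (input_tokens * input_per_1m
                   + output_tokens * output_per_1m) / 1_000_000
    return requests_per_month * per_request

# Hypothetical workload: 50,000 requests/month, 1,200 input + 300 output
# tokens each, at Qwen-Turbo rates ($0.033 in / $0.13 out):
print(round(monthly_cost(50_000, 1200, 300, 0.033, 0.13), 2))  # → 3.93
```

Running the same workload through a different row of the table shows how quickly output-heavy tasks shift the ranking, since output rates are several times the input rates on most models.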
Qwen3 Next 80B A3B Instruct is the lowest-priced Qwen model with public pricing at $0/1M input tokens. It suits high-volume tasks where cost matters most — classification, extraction, summarization, and similar workloads that don't need frontier reasoning.
Qwen-Max is Qwen's highest-tier model at $1.04/1M input. It delivers the most sophisticated reasoning, instruction-following, and nuance. For workloads that don't require frontier performance, a mid-tier model typically cuts inference costs substantially.
Qwen3.6 Plus, Qwen3 Coder Next, Qwen3.5-122B-A10B and 21 more support deep reasoning mode, which improves performance on multi-step coding, debugging, and code review. For simpler autocomplete or snippet generation, a faster, cheaper model often delivers acceptable quality at a fraction of the cost.
Qwen3.6 Plus, Qwen3 Coder Next, Qwen3.5-122B-A10B and 42 more support function calling (tool use), required for agentic workflows. Agents need a model that reliably follows structured output schemas — test with your specific tool definitions before committing to production volumes.
Yes — Qwen3.6 Plus, Qwen3.5-122B-A10B, Qwen3.5-27B, Qwen3.5-35B-A3B and 17 more accept image input alongside text. You can pass screenshots, photos, charts, and documents for analysis. Vision adds no separate line-item on most Qwen models — you're billed for the token equivalent of the image.
Yes — Qwen supports prompt caching (discounts for repeated context) and batch processing (accept a delay, cut costs ~50%). Where published, cached and batch rates are listed on each model's detail page. Caching pays off quickly if your prompts share a long system prompt or document prefix across many calls.
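As a rough sketch of why caching matters: if a fraction of your input tokens hits the cache at a discounted rate, the effective input price drops proportionally. The 75% cache discount and 80% hit rate below are purely illustrative assumptions, not published Qwen rates; only the ~50% batch figure comes from the paragraph above.

```python
def effective_input_price(base_per_1m: float, cached_fraction: float,
                          cache_discount: float) -> float:
    """Blend full-price and cached input tokens into one effective $/1M rate.

    cached_fraction: share of input tokens served from cache (0..1).
    cache_discount:  price reduction on cached tokens (0..1) — illustrative here.
    """
    cached_price = base_per_1m * (1 - cache_discount)
    return (1 - cached_fraction) * base_per_1m + cached_fraction * cached_price

# Hypothetical: 80% of input tokens cached at a 75% discount, $0.26/1M base:
print(round(effective_input_price(0.26, 0.80, 0.75), 4))  # → 0.104
```

Stacking a ~50% batch discount on top would roughly halve that again, which is why long shared prefixes plus batching dominate the economics of high-volume pipelines.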
Qwen has historically adjusted prices when launching new model generations, often cutting rates to stay competitive. Buzzi.ai snapshots pricing daily — you can subscribe to price-drop alerts on any Qwen model using the "Alert me" button on its detail page.
Use the main comparison wizard to run the same calculator across Qwen, Anthropic, Google, Meta, Mistral, and 20+ other providers. Set your exact workload and get a ranked cost chart in under a minute.
Qwen3.6 Plus, Qwen3 Coder Next, Qwen3.5-122B-A10B, Qwen3.5-27B and 20 more offer an extended thinking or reasoning mode. The model spends extra compute "thinking" before answering — slower and more expensive, but meaningfully better on complex, multi-step problems. Standard mode is faster and cheaper for routine tasks.

Look wider

Compare Qwen against other providers.

Open the full wizard — pick a use case, set your usage, and cross-compare against OpenAI, Anthropic, Google, and 20+ more.