Qwen Models API Cost Calculator & Comparison

Every Qwen model, side by side — current API rates, context window, benchmarks, and a live calculator that ranks them at your exact workload. 45 active models, 45 with public pricing. Prices refreshed daily.

Models tracked: 45
Active: 45
With public pricing: 45
Cheapest input: $0.00/1M

Calculate your Qwen API cost at your workload.

Set your workload — every priced model ranks in real time.

Adjust the workload

Every model below updates in real time.

Workload slider: 1,000 · 10,000 · 50,000 · 250,000 · 1M · 10M

Ranked by your monthly bill


Pricing at a glance

Blended $/1M tokens across the lineup.

Blended price uses a 3-to-1 input/output ratio. Green bar = cheapest.
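The blended figure is simple to reproduce. A minimal sketch, assuming the 3-to-1 input/output weighting stated above; the example rates ($0.26 in, $1.56 out) are one plausible pairing from the pricing table further down the page:

```python
def blended_price(input_per_1m: float, output_per_1m: float) -> float:
    """Blend input/output rates at a 3-to-1 input/output token ratio."""
    return (3 * input_per_1m + 1 * output_per_1m) / 4

# Example: a model priced at $0.26 in / $1.56 out per 1M tokens
print(round(blended_price(0.26, 1.56), 3))  # 0.585
```

A 3:1 weighting reflects the common case where prompts (context, documents, history) are several times longer than completions, so input rates dominate the blended cost.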

Quick picks

Best Qwen model for your use case.

As of April 2026, Qwen offers 45 active models via API, with input rates ranging from $0/1M to $1.04/1M tokens. The most context-rich models handle up to 1M tokens. Models support vision, deep reasoning, and tool use. All prices are USD per 1 million tokens.

Quality vs price

Qwen benchmarks at a glance.

Each point is one model — X is blended $/1M tokens, Y is the average of available quality benchmarks. Larger bubbles mean larger context windows.
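The Y-axis value can be reproduced from the per-model scores listed below. A minimal sketch; `avg_quality` is a hypothetical helper that simply averages whichever benchmarks a model actually reports:

```python
def avg_quality(scores: dict[str, float]) -> float:
    """Average only the benchmarks a model actually reports."""
    return sum(scores.values()) / len(scores)

# Scores for Qwen3.5 397B A17B from the per-model table
scores = {"MMLU": 88.6, "MMLU Pro": 87.8, "AIME 2025": 91.3}
print(round(avg_quality(scores), 1))  # 89.2
```

Because the average only covers reported benchmarks, models with few published scores (or a single strong one) can rank higher or lower than a full evaluation would place them — compare like with like.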

Per-model benchmark scores

Model | Avg | Scores
Qwen3.5 397B A17B | 89.2 | MMLU 88.6, MMLU Pro 87.8, AIME 2025 91.3
Qwen3.5 Plus 2026-02-15 | 87.2 | MMLU 88.6, MMLU Pro 87.8, HumanEval 79.3, AIME 2025 91.3, IFEval 92.6, LiveCodeBench 83.6
Qwen3.5-27B | 86.8 | MMLU Pro 86.1, GPQA Diamond 85.5, IFEval 95, LiveCodeBench 80.7
Qwen3.5-122B-A10B | 83.5 | MMLU Pro 86.7, GPQA Diamond 86.6, IFEval 93.4, SWE-Bench Verified 72, LiveCodeBench 78.9
Qwen3 Next 80B A3B Thinking | 81.1 | MMLU Pro 82.7, GPQA Diamond 77.2, AIME 2025 87.8, IFEval 88.9, LiveCodeBench 68.7
Qwen3.5-35B-A3B | 81.0 | MMLU Pro 85.3, GPQA Diamond 84.2, IFEval 91.9, LiveCodeBench 74.6, SWE-Bench Verified 69.2
Qwen3.5-9B | 80.3 | MMLU Pro 82.5, GPQA Diamond 81.7, IFEval 91.5, LiveCodeBench 65.6
Qwen3 30B A3B Thinking 2507 | 78.8 | MMLU Pro 80.9, GPQA Diamond 73.4, AIME 2025 85, IFEval 88.9, LiveCodeBench 66
Qwen3.6 Plus | 78.8 | SWE-Bench Verified 78.8
Qwen2.5 72B Instruct | 73.9 | MMLU 86, MMLU Pro 58.1, HumanEval 86.6, MATH 83.1, LiveCodeBench 55.5, IFEval 86.4, BBH 61.9
Qwen3 Next 80B A3B Instruct | 73.6 | MMLU Pro 80.6, AIME 2025 69.5, IFEval 87.6, LiveCodeBench 56.6
Qwen3 Max | 73.2 | SWE-Bench Verified 69.6, AIME 2025 81.6, LiveCodeBench 69, GPQA Diamond 72.6
Qwen3 Max Thinking | 72.6 | GPQA Diamond 72.6
Qwen3 Coder 480B A35B | 69.6 | SWE-Bench Verified 69.6
Qwen3 30B A3B Instruct 2507 | 67.6 | MMLU Pro 78.4, GPQA Diamond 70.4, AIME 2025 61.3, IFEval 84.7, LiveCodeBench 43.2
Qwen3 30B A3B | 66.8 | MMLU 81.4, MMLU Pro 61.5, GPQA Diamond 43.9, AIME 2024 80.4
Qwen3 32B | 64.0 | MMLU 83.3, MMLU Pro 65.5, AIME 2024 81.4, AIME 2025 72.9, SciPredict 17.0
Qwen2.5 Coder 32B Instruct | 62.1 | HumanEval 92.7, LiveCodeBench 55, IFEval 72.7, BBH 52.3, MMLU Pro 37.9
Qwen3 14B | 60.6 | MMLU 81, MMLU Pro 61, GPQA Diamond 39.9
Qwen2.5 7B Instruct | 54.9 | MMLU 74.2, HumanEval 57.9, MATH 49.8, IFEval 75.8, BBH 34.9, MMLU Pro 36.5
QwQ 32B | 46.4 | AIME 2024 79.5, IFEval 83.9, LiveCodeBench 63.4, BBH 2.9, MMLU Pro 2.2
Qwen3 235B A22B Instruct 2507 | 46.1 | AIME 2025 92.3, FrontierMath Tier-4 0.0%
Qwen3 235B A22B Thinking 2507 | 46.1 | AIME 2025 92.3, FrontierMath Tier-4 0.0%

Open weights

Open models from Qwen

Qwen ships 32 open-source or open-weights models you can self-host or fine-tune. Each links to its Hugging Face card.

Every model

Every Qwen model — pricing, context & capabilities.

Model | Context | Input /1M | Output /1M
Qwen3.6 Plus | 1M | $0.325 | $1.95
Qwen3 Coder Next | 262K | $0.15 | $0.8
Qwen3.5-122B-A10B | 262K | $0.26 | $2.08
Qwen3.5-27B | 262K | $0.195 | $1.56
Qwen3.5-35B-A3B | 262K | $0.163 | $1.30
Qwen3.5-9B | 262K | $0.1 | $0.15
Qwen3.5-Flash | 1M | $0.065 | $0.26
Qwen3.5 397B A17B | 262K | $0.39 | $2.34
Qwen3.5 Plus 2026-02-15 | 1M | $0.26 | $1.56
Qwen-Plus | 1M | $0.26 | $0.78
Qwen3 VL 30B A3B Instruct | 131K | $0.13 | $0.52
Qwen3 VL 30B A3B Thinking | 131K | $0.13 | $1.56
Qwen3 VL 32B Instruct | 131K | $0.104 | $0.416
Qwen3 VL 8B Instruct | 131K | $0.08 | $0.5
Qwen3 VL 8B Thinking | 131K | $0.117 | $1.36
Qwen3 VL 235B A22B Instruct | 262K | $0.2 | $0.88
Qwen3 VL 235B A22B Thinking | 131K | $0.26 | $2.60
Qwen3 Max | 262K | $0.78 | $3.90
Qwen3 Max Thinking | 262K | $0.78 | $3.90
Qwen3 Next 80B A3B Instruct | 262K | $0.09 | $1.10
Qwen3 Next 80B A3B Instruct | 262K | $0.0 | $0.0
Qwen3 Next 80B A3B Thinking | 131K | $0.098 | $0.78
Qwen3 Coder Flash | 1M | $0.195 | $0.975
Qwen3 Coder 30B A3B Instruct | 160K | $0.07 | $0.27
Qwen Plus 0728 (thinking) | 1M | $0.26 | $0.78
Qwen3 235B A22B Instruct 2507 | 262K | $0.071 | $0.1
Qwen3 235B A22B Thinking 2507 | 262K | $0.13 | $0.6
Qwen3 30B A3B Instruct 2507 | 262K | $0.09 | $0.3
Qwen3 30B A3B Thinking 2507 | 131K | $0.08 | $0.4
Qwen3 Coder 480B A35B | 262K | $0.22 | $1.00
Qwen3 Coder 480B A35B | 262K | $0.0 | $0.0
Qwen3 Coder Plus | 1M | $0.65 | $3.25
Qwen-Turbo | 131K | $0.033 | $0.13
Qwen3 14B | 41K | $0.06 | $0.24
Qwen3 30B A3B | 41K | $0.08 | $0.28
Qwen3 32B | 41K | $0.08 | $0.24
Qwen3 8B | 41K | $0.05 | $0.4
QwQ 32B | 131K | $0.15 | $0.58
Qwen2.5 VL 72B Instruct | 32K | $0.25 | $0.75
Qwen-Max | 33K | $1.04 | $4.16
Qwen2.5 Coder 32B Instruct | 33K | $0.66 | $1.00
Qwen2.5 7B Instruct | 33K | $0.04 | $0.1
Qwen2.5 72B Instruct | 33K | $0.12 | $0.39
Qwen VL Max | 131K | $0.52 | $2.08
Qwen VL Plus | 131K | $0.137 | $0.409

FAQ

Frequently asked questions

Pricing patterns, best-known use cases, and how this provider stacks up.


Qwen API pricing ranges from $0 to $1.04 per 1M input tokens. Output tokens cost more than input on every model. Prices are per 1 million tokens (1M ≈ 750,000 words). Use the calculator above to estimate your monthly spend at your actual workload.
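The per-1M rates convert to a monthly bill in a few lines, if you prefer to check the math outside the calculator. A sketch assuming a uniform token count per request; the model name and rates ($0.26 in / $0.78 out, i.e. Qwen-Plus) are example inputs, not a recommendation:

```python
def monthly_cost(requests_per_month: int,
                 input_tokens: int, output_tokens: int,
                 input_per_1m: float, output_per_1m: float) -> float:
    """Estimate monthly spend in USD; rates are per 1M tokens."""
    total_in = requests_per_month * input_tokens
    total_out = requests_per_month * output_tokens
    return (total_in * input_per_1m + total_out * output_per_1m) / 1_000_000

# 50,000 requests/month, 1,200 input + 300 output tokens each,
# at example rates of $0.26 in / $0.78 out per 1M tokens
print(round(monthly_cost(50_000, 1200, 300, 0.26, 0.78), 2))  # 27.3
```

Running the same function over several (input_rate, output_rate) pairs from the table reproduces the ranked-cost view for your own workload.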
Qwen3 Next 80B A3B Instruct is the lowest-priced Qwen model with public pricing at $0/1M input tokens. It suits high-volume tasks where cost matters most — classification, extraction, summarization, and similar workloads that don't need frontier reasoning.
Qwen-Max is Qwen's highest-tier model at $1.04/1M input. It delivers the most sophisticated reasoning, instruction-following, and nuance. For workloads that don't require frontier performance, a mid-tier model typically cuts inference costs substantially.
Qwen3.6 Plus, Qwen3 Coder Next, Qwen3.5-122B-A10B and 21 more support deep reasoning mode, which improves performance on multi-step coding, debugging, and code review. For simpler autocomplete or snippet generation, a faster, cheaper model often delivers acceptable quality at a fraction of the cost.
Qwen3.6 Plus, Qwen3 Coder Next, Qwen3.5-122B-A10B and 42 more support function calling (tool use), required for agentic workflows. Agents need a model that reliably follows structured output schemas — test with your specific tool definitions before committing to production volumes.
Yes — Qwen3.6 Plus, Qwen3.5-122B-A10B, Qwen3.5-27B, Qwen3.5-35B-A3B and 17 more accept image input alongside text. You can pass screenshots, photos, charts, and documents for analysis. Vision adds no separate line item on most Qwen models — you're billed for the token equivalent of the image.
Yes — Qwen supports prompt caching (discounted rates for repeated context) and batch processing (accept a delay for roughly 50% lower cost). Caching pays off quickly if your prompts share a long system prompt or document prefix across many calls.
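The batch savings are easy to sanity-check. A sketch assuming the roughly 50% batch discount mentioned above; the exact rate is an assumption and can vary by model:

```python
def with_batch(standard_cost_usd: float, discount: float = 0.5) -> float:
    """Apply a batch-processing discount (~50% assumed; varies by model)."""
    return standard_cost_usd * (1 - discount)

# A $100/month workload drops to ~$50 if it can tolerate batch latency.
print(with_batch(100.0))  # 50.0
```

The same function works for a cache discount: pass the cached share of your spend and the cache rate instead of the batch defaults.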
Qwen has historically adjusted prices when launching new model generations, often cutting rates to stay competitive. Buzzi.ai snapshots pricing daily — you can subscribe to price-drop alerts on any Qwen model using the "Alert me" button on its detail page.
Use the main comparison wizard to run the same calculator across Qwen, Anthropic, Google, Meta, Mistral, and 20+ other providers. Set your exact workload and get a ranked cost chart in under a minute.
Qwen3.6 Plus, Qwen3 Coder Next, Qwen3.5-122B-A10B, Qwen3.5-27B and 20 more offer an extended thinking or reasoning mode. The model spends extra compute "thinking" before answering — slower and more expensive, but meaningfully better on complex, multi-step problems. Standard mode is faster and cheaper for routine tasks.

Look wider

Compare Qwen against other providers.

Open the full wizard — pick a use case, set your usage, and cross-compare against OpenAI, Anthropic, Google, and 20+ more.