Best for: Cheap Bulk Workloads
Best LLM for Cheap Bulk Workloads
Ranked primarily on input and output $/1M with a benchmark floor so you do not ship junk at volume.
Updated May 2026. Top 3 this month: MiMo-V2-Flash, Hunyuan A13B Instruct, Phi 4.
How we rank
Some workloads are massive but forgiving: classification, tagging, summarization, PII scrubbing. The question is: what is the cheapest model that still clears the quality floor? We weight price most heavily here, but we set a benchmark floor so that the cheapest pick is still usable at volume.
Our full methodology is published on the methodology page.
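As a rough illustration of the "cheapest model that clears the floor" idea, the sketch below filters candidates by a benchmark floor and then sorts the survivors by a blended price. It is not the site's actual scoring: the `Candidate` fields, the `output_share` mix, the model names, and every score and price in the sample list are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens
    benchmark: float     # hypothetical quality score on a 0-100 scale


def blended_price(c: Candidate, output_share: float = 0.25) -> float:
    """Price per 1M tokens, assuming `output_share` of all tokens are output."""
    return (1 - output_share) * c.input_per_m + output_share * c.output_per_m


def rank_cheapest_above_floor(candidates, floor, output_share=0.25):
    """Drop models below the quality floor, then sort the rest cheapest first."""
    eligible = [c for c in candidates if c.benchmark >= floor]
    return sorted(eligible, key=lambda c: blended_price(c, output_share))


# Placeholder data, not taken from the ranking.
candidates = [
    Candidate("model-a", 0.05, 0.10, 48.0),  # cheapest, but below the floor
    Candidate("model-b", 0.10, 0.30, 71.0),
    Candidate("model-c", 0.40, 0.40, 75.0),
]

ranked = rank_cheapest_above_floor(candidates, floor=60.0)
print([c.name for c in ranked])
```

The floor is applied before price is considered, so a model that is cheap but below the floor never reaches the list, however low its price.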
Pillars and weights:
Full ranking
| Rank | Model | Provider | Input $/1M | Output $/1M | Context |
|---|---|---|---|---|---|
| 1 | MiMo-V2-Flash | Xiaomi | $0.09 | $0.29 | 262,144 |
| 2 | Hunyuan A13B Instruct | Tencent | $0.14 | $0.57 | 131,072 |
| 3 | Phi 4 | Microsoft | $0.07 | $0.14 | 16,384 |
| 4 | Llama 3.3 70B Instruct | Meta | $0.12 | $0.38 | 131,072 |
| 5 | Qwen2.5 72B Instruct | Qwen | $0.12 | $0.39 | 32,768 |
| 6 | Gemma 4 31B | | $0.13 | $0.38 | 262,144 |
| 7 | Olmo 3 32B Think | Allen AI | $0.15 | $0.50 | 65,536 |
| 8 | Qwen3 32B | Qwen | $0.08 | $0.24 | 40,960 |
| 9 | Llama 3.1 70B Instruct | Meta | $0.40 | $0.40 | 131,072 |
| 10 | Qwen3.5-9B | Qwen | $0.10 | $0.15 | 262,144 |
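To turn the per-million prices in the table into a budget figure, the following sketch computes the cost of one bulk job. The prices are copied from three rows of the table; the workload (10 million documents at 500 input and 50 output tokens each) is a made-up assumption for illustration, and `job_cost_usd` is a hypothetical helper name.

```python
def job_cost_usd(input_tokens, output_tokens, input_per_m, output_per_m):
    """Total cost in USD given token counts and $/1M-token prices."""
    return (input_tokens / 1_000_000) * input_per_m \
        + (output_tokens / 1_000_000) * output_per_m


# (input $/1M, output $/1M), taken from the ranking table.
PRICES = {
    "MiMo-V2-Flash": (0.09, 0.29),
    "Phi 4": (0.07, 0.14),
    "Llama 3.1 70B Instruct": (0.40, 0.40),
}

# Assumed workload: 10M short documents, e.g. a tagging job.
docs = 10_000_000
input_tokens = docs * 500
output_tokens = docs * 50

costs = {
    name: job_cost_usd(input_tokens, output_tokens, inp, out)
    for name, (inp, out) in PRICES.items()
}
for name, cost in costs.items():
    print(f"{name}: ${cost:,.2f}")
```

Under these assumptions the job costs about $595 on MiMo-V2-Flash, $420 on Phi 4, and $2,200 on Llama 3.1 70B Instruct. Because input tokens dominate this workload, the input price drives most of the difference.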
Field notes
Use batch pricing aggressively. 50%+ discounts are common.
Use cached-input pricing for repeating preambles.
A cheaper model with a short retry loop often beats a more expensive model one-shot.
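The retry claim can be checked with a little arithmetic. The sketch below assumes each attempt succeeds independently with a fixed probability, that failures are detectable (for example by a schema validator), and that retries stop after `max_attempts`. The per-call costs, success rates, and the 50% batch discount are illustrative numbers, not measurements.

```python
def expected_attempts(p_success, max_attempts):
    """Expected number of calls when retrying until success or the cap."""
    q = 1.0 - p_success
    # Attempt k+1 happens only if the first k attempts all failed.
    return sum(q ** k for k in range(max_attempts))


def expected_cost(cost_per_attempt, p_success, max_attempts, batch_discount=0.0):
    """Expected cost per item, with an optional fractional batch discount."""
    discounted = cost_per_attempt * (1.0 - batch_discount)
    return discounted * expected_attempts(p_success, max_attempts)


def residual_failure_rate(p_success, max_attempts):
    """Probability that every allowed attempt fails."""
    return (1.0 - p_success) ** max_attempts


# Hypothetical numbers: a cheap model with up to 3 tries vs. a pricier one-shot.
cheap = expected_cost(0.0001, p_success=0.90, max_attempts=3)
pricey = expected_cost(0.0004, p_success=0.99, max_attempts=1)
cheap_batched = expected_cost(0.0001, 0.90, 3, batch_discount=0.5)

print(f"cheap + retries: ${cheap:.6f}, fails {residual_failure_rate(0.90, 3):.3%}")
print(f"pricey one-shot: ${pricey:.6f}, fails {residual_failure_rate(0.99, 1):.3%}")
print(f"cheap + retries + batch: ${cheap_batched:.6f}")
```

With these inputs the cheap model averages 1.11 calls per item and still costs well under a third of the one-shot alternative, while leaving fewer unresolved failures. The conclusion flips if failures cannot be detected cheaply or if attempts are strongly correlated, so the independence assumption is worth testing on a sample first.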
FAQ
The questions teams ask before picking a model for cheap bulk workloads.