Meta: Llama 3.1 8B Instruct
Llama 3.1 8B Instruct from Meta is a text model for general chat and analysis. It combines low latency and efficient inference with a 16K-token context window and a low-cost profile. Use it for production workloads when latency, cost, and throughput matter.
Input: $0.02/1M tokens
Output: $0.05/1M tokens
Cached: $0.01/1M tokens
Batch: $0.01/1M tokens
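
As a rough illustration of how the listed rates translate into a per-request cost, here is a minimal Python sketch. It assumes the rates are in USD per 1M tokens, that cached tokens are the part of the input served from cache and are billed at the cached rate instead of the input rate, and that the batch rate is left out because the listing does not say how it is applied. The function name `estimate_cost` and its parameters are hypothetical and are not part of any provider API.

```python
from decimal import Decimal

# Rates copied from the listing, assumed to be USD per 1M tokens.
PRICE_PER_MILLION = {
    "input": Decimal("0.02"),
    "output": Decimal("0.05"),
    "cached": Decimal("0.01"),
}
TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: cached_tokens is the subset of input_tokens served from
    cache, billed at the cached rate rather than the input rate.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")

    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * PRICE_PER_MILLION["input"]
        + cached_tokens * PRICE_PER_MILLION["cached"]
        + output_tokens * PRICE_PER_MILLION["output"]
    )
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # 1M input tokens and 200K output tokens, nothing cached.
    print(estimate_cost(1_000_000, 200_000))
    # Same request with half of the input served from cache.
    print(estimate_cost(1_000_000, 200_000, cached_tokens=500_000))
```

`Decimal` is used instead of floats so that small per-token prices add up without rounding noise. Actual billing may differ, so treat the result as an estimate only.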