Xiaomi: MiMo-V2-Omni
Xiaomi: MiMo-V2-Omni is an audio model for agent workflows and tool use. It combines multimodal input handling with reliable tool use and agent behavior, a 262K-token context window, and a balanced cost profile. Use it for audio understanding and multimodal input when quality, speed, and cost all matter.
Input: $0.40/1M
Output: $2.00/1M
Cached: $0.08/1M
Batch: $0.20/1M
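The listed rates can be turned into a rough per-request cost estimate. The sketch below is illustrative only and rests on several assumptions not stated in the listing: that the prices are in USD per 1M tokens, that cached tokens are a subset of the input tokens and are billed at the cached rate instead of the input rate, and that the batch rate is left out because the listing does not say which tokens it applies to. The names `RATES_PER_MILLION` and `estimate_cost` are hypothetical helpers, not part of any Xiaomi API.

```python
# Rates copied from the pricing list; assumed to be USD per 1M tokens.
RATES_PER_MILLION = {"input": 0.40, "output": 2.00, "cached": 0.08}


def estimate_cost(input_tokens, output_tokens, cached_tokens=0):
    """Estimate the USD cost of one request.

    Assumption: cached_tokens counts input tokens served from cache,
    billed at the cached rate instead of the regular input rate.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")

    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * RATES_PER_MILLION["input"]
        + cached_tokens * RATES_PER_MILLION["cached"]
        + output_tokens * RATES_PER_MILLION["output"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    # 100K input tokens (20K of them cached) and 5K output tokens.
    cost = estimate_cost(100_000, 5_000, cached_tokens=20_000)
    print(f"${cost:.4f}")  # prints "$0.0436"
```

Under these assumptions, output tokens cost five times as much as uncached input tokens, so response length tends to dominate the bill for short prompts, while caching matters most for long, repeated contexts.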