mimo-v2-flash
Xiaomi MiMo-V2-Flash is a mixture-of-experts (MoE) model developed in-house by Xiaomi, designed for extreme inference efficiency, with 309B total parameters (15B active). By combining a hybrid attention architecture with multi-layer MTP (multi-token prediction) inference acceleration, it ranks among the top two open-source models worldwide on multiple agent benchmarks.
| Provider | Source | Input Price ($/1M) | Output Price ($/1M) | Description | Free |
|---|---|---|---|---|---|
| vercel | vercel | $0.10 | $0.29 | Xiaomi MiMo-V2-Flash MoE model, 309B total parameters (15B active); see description above | |
| xiaomi | models-dev | $0.07 | $0.21 | Provider: Xiaomi, Context: 256000, Output Limit: 32000 | |
| chutes | models-dev | $0.17 | $0.65 | Provider: Chutes, Context: 40960, Output Limit: 40960 | |
| zenmux | models-dev | $0.00 | $0.00 | Provider: ZenMux, Context: 262144, Output Limit: 64000 | |
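The per-million-token prices in the table translate directly into per-request costs. A minimal sketch, where only the provider names and prices come from the table and the request sizes are illustrative assumptions:

```python
# Prices in USD per 1M tokens (input, output), taken from the table above.
PRICES = {
    "vercel": (0.10, 0.29),
    "xiaomi": (0.07, 0.21),
    "chutes": (0.17, 0.65),
    "zenmux": (0.00, 0.00),
}

def request_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request: tokens / 1_000_000 * price-per-million."""
    in_price, out_price = PRICES[provider]
    return (input_tokens / 1_000_000) * in_price + (output_tokens / 1_000_000) * out_price

if __name__ == "__main__":
    # Illustrative example: a 50k-token prompt with a 4k-token completion
    # via the xiaomi provider: 0.05 * $0.07 + 0.004 * $0.21 = $0.00434.
    print(f"${request_cost('xiaomi', 50_000, 4_000):.6f}")
```

Note that the providers also differ sharply in context window (40,960 tokens on Chutes vs 262,144 on ZenMux), so the cheapest option per token may not fit a given workload.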