# minimax-m2
MiniMax-M2 redefines efficiency for agents. It is a compact, fast, and cost-effective MoE model (230 billion total parameters with 10 billion active parameters) built for elite performance in coding and agentic tasks, all while maintaining powerful general intelligence.
| Provider | Source | Input Price ($/1M) | Output Price ($/1M) | Description | Free |
|---|---|---|---|---|---|
| vercel | vercel | $0.27 | $1.15 | MiniMax-M2 redefines efficiency for agents. It is a compact, fast, and cost-effective MoE model (230 billion total parameters with 10 billion active parameters) built for elite performance in coding and agentic tasks, all while maintaining powerful general intelligence. | |
| poe | poe | $3,300.00 | - | MiniMax-M2 redefines efficiency for agents. It's a compact, fast, and cost-effective MoE model (230 billion total parameters with 10 billion active parameters) built for elite performance in coding and agentic tasks, all while maintaining powerful general intelligence. With just 10 billion activated parameters, MiniMax-M2 provides the sophisticated, end-to-end tool use performance expected from today's leading models, but in a streamlined form factor that makes deployment and scaling easier than ever. Technical specifications: file support: Text, Markdown, and PDF files; context window: 200k tokens. | |
| nvidia | models-dev | $0.00 | $0.00 | Provider: Nvidia, Context: 128000, Output Limit: 16384 | |
| siliconflowcn | models-dev | $0.30 | $1.20 | Provider: SiliconFlow (China), Context: 197000, Output Limit: 131000 | |
| chutes | models-dev | $0.26 | $1.02 | Provider: Chutes, Context: 196608, Output Limit: 196608 | |
| siliconflow | models-dev | $0.30 | $1.20 | Provider: SiliconFlow, Context: 197000, Output Limit: 131000 | |
| huggingface | models-dev | $0.30 | $1.20 | Provider: Hugging Face, Context: 204800, Output Limit: 204800 | |
| minimax | models-dev | $0.30 | $1.20 | Provider: MiniMax, Context: 196608, Output Limit: 128000 | |
| minimaxcn | models-dev | $0.30 | $1.20 | Provider: MiniMax (China), Context: 196608, Output Limit: 128000 | |
| zenmux | models-dev | $0.30 | $1.20 | Provider: ZenMux, Context: 204800, Output Limit: 64000 | |
| iflowcn | models-dev | $0.00 | $0.00 | Provider: iFlow, Context: 204800, Output Limit: 131100 | |
| synthetic | models-dev | $0.55 | $2.19 | Provider: Synthetic, Context: 196608, Output Limit: 131000 | |
| deepinfra | models-dev | $0.25 | $1.02 | Provider: Deep Infra, Context: 262144, Output Limit: 32768 | |
| fireworksai | models-dev | $0.30 | $1.20 | Provider: Fireworks AI, Context: 192000, Output Limit: 192000 | |
| openrouter | openrouter | $0.20 | $1.00 | MiniMax-M2 is a compact, high-efficiency large language model optimized for end-to-end coding and agentic workflows. With 10 billion activated parameters (230 billion total), it delivers near-frontier intelligence across general reasoning, tool use, and multi-step task execution while maintaining low latency and deployment efficiency. The model excels in code generation, multi-file editing, compile-run-fix loops, and test-validated repair, showing strong results on SWE-Bench Verified, Multi-SWE-Bench, and Terminal-Bench. It also performs competitively in agentic evaluations such as BrowseComp and GAIA, effectively handling long-horizon planning, retrieval, and recovery from execution errors. Benchmarked by [Artificial Analysis](https://artificialanalysis.ai/models/minimax-m2), MiniMax-M2 ranks among the top open-source models for composite intelligence, spanning mathematics, science, and instruction-following. Its small activation footprint enables fast inference, high concurrency, and improved unit economics, making it well-suited for large-scale agents, developer assistants, and reasoning-driven applications that require responsiveness and cost efficiency. To avoid degrading this model's performance, MiniMax highly recommends preserving reasoning between turns. Learn more about using reasoning_details to pass back reasoning in our [docs](https://openrouter.ai/docs/use-cases/reasoning-tokens#preserving-reasoning-blocks). Context: 196608 |
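The per-million-token prices above translate directly into a per-request cost: tokens divided by 1,000,000, times the listed rate, summed over input and output. The sketch below illustrates that arithmetic for a few providers taken from the table; the `request_cost` helper and the `PRICES_PER_M` dictionary are illustrative names, not any provider's official SDK, and prices may change.

```python
# Estimate per-request USD cost from the $/1M-token prices in the table above.
# Provider names and rates are copied from the table; the helper itself is a
# minimal sketch for comparison purposes only.

PRICES_PER_M = {            # (input $/1M tokens, output $/1M tokens)
    "vercel":     (0.27, 1.15),
    "openrouter": (0.20, 1.00),
    "deepinfra":  (0.25, 1.02),
    "minimax":    (0.30, 1.20),
}

def request_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request: tokens / 1e6 * price-per-million, in + out."""
    in_price, out_price = PRICES_PER_M[provider]
    return (input_tokens / 1_000_000) * in_price + (output_tokens / 1_000_000) * out_price

# Example: a 100k-token prompt with a 10k-token completion on vercel costs
# 0.1 * $0.27 + 0.01 * $1.15 = $0.0385.
for name in PRICES_PER_M:
    print(f"{name}: ${request_cost(name, 100_000, 10_000):.4f}")
```

For agentic workloads, where many turns replay a long context, the input rate usually dominates, so the input column is the one to compare first.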