gpt-oss-120b
Extremely capable general-purpose LLM with strong, controllable reasoning
| Provider | Source | Input Price ($/1M) | Output Price ($/1M) | Description | Free |
|---|---|---|---|---|---|
| vercel | vercel | Input: $0.10 | Output: $0.50 | Extremely capable general-purpose LLM with strong, controllable reasoning capabilities | |
| together | together | Input: $0.15 | Output: $0.60 | - | |
| poe | poe | Input: $1,200.00 | Output: - | OpenAI's GPT-OSS-120B is an open-weight reasoning model released under the Apache 2.0 license and the OpenAI GPT-OSS usage policy. Developed with feedback from the open-source community, this text-only model is compatible with the OpenAI Responses API and is designed for agentic workflows, with strong instruction following, tool use (such as web search and Python code execution), and reasoning. It achieves near-parity with OpenAI o4-mini on core reasoning benchmarks while running efficiently on a single 80 GB GPU, and performs strongly on tool use, few-shot function calling, and CoT reasoning (per results on the Tau-Bench agentic evaluation suite), as well as on HealthBench (outperforming proprietary models such as OpenAI o1 and GPT-4o). File support: attachments not supported. Context window: 128k tokens. | |
| vultr | models-dev | Input: $0.20 | Output: $0.20 | Provider: Vultr, Context: 121808, Output Limit: 8192 | |
| nvidia | models-dev | Input: $0.00 | Output: $0.00 | Provider: Nvidia, Context: 128000, Output Limit: 8192 | |
| groq | models-dev | Input: $0.15 | Output: $0.60 | Provider: Groq, Context: 131072, Output Limit: 65536 | |
| nebius | models-dev | Input: $0.15 | Output: $0.60 | Provider: Nebius Token Factory, Context: 131072, Output Limit: 8192 | |
| siliconflowcn | models-dev | Input: $0.05 | Output: $0.45 | Provider: SiliconFlow (China), Context: 131000, Output Limit: 8000 | |
| cortecs | models-dev | Input: $0.00 | Output: $0.00 | Provider: Cortecs, Context: 128000, Output Limit: 128000 | |
| togetherai | models-dev | Input: $0.15 | Output: $0.60 | Provider: Together AI, Context: 131072, Output Limit: 131072 | |
| siliconflow | models-dev | Input: $0.05 | Output: $0.45 | Provider: SiliconFlow, Context: 131000, Output Limit: 8000 | |
| helicone | models-dev | Input: $0.04 | Output: $0.16 | Provider: Helicone, Context: 131072, Output Limit: 131072 | |
| fastrouter | models-dev | Input: $0.15 | Output: $0.60 | Provider: FastRouter, Context: 131072, Output Limit: 32768 | |
| cloudflareworkersai | models-dev | Input: $0.35 | Output: $0.75 | Provider: Cloudflare Workers AI, Context: 128000, Output Limit: 128000 | |
| cloudflareaigateway | models-dev | Input: $0.35 | Output: $0.75 | Provider: Cloudflare AI Gateway, Context: 128000, Output Limit: 16384 | |
| ovhcloud | models-dev | Input: $0.09 | Output: $0.47 | Provider: OVHcloud AI Endpoints, Context: 131000, Output Limit: 131000 | |
| synthetic | models-dev | Input: $0.10 | Output: $0.10 | Provider: Synthetic, Context: 128000, Output Limit: 32768 | |
| deepinfra | models-dev | Input: $0.05 | Output: $0.24 | Provider: Deep Infra, Context: 131072, Output Limit: 16384 | |
| submodel | models-dev | Input: $0.10 | Output: $0.50 | Provider: submodel, Context: 131072, Output Limit: 32768 | |
| nanogpt | models-dev | Input: $1.00 | Output: $2.00 | Provider: NanoGPT, Context: 128000, Output Limit: 8192 | |
| fireworksai | models-dev | Input: $0.15 | Output: $0.60 | Provider: Fireworks AI, Context: 131072, Output Limit: 32768 | |
| ionet | models-dev | Input: $0.04 | Output: $0.40 | Provider: IO.NET, Context: 131072, Output Limit: 4096 | |
| scaleway | models-dev | Input: $0.15 | Output: $0.60 | Provider: Scaleway, Context: 128000, Output Limit: 8192 | |
| cerebras | models-dev | Input: $0.25 | Output: $0.69 | Provider: Cerebras, Context: 131072, Output Limit: 32768 | |
| azureai | litellm | Input: $0.15 | Output: $0.60 | Source: azure_ai, Context: 131072 | |
| sambanova | litellm | Input: $3.00 | Output: $4.50 | Source: sambanova, Context: 131072 | |
| wandb | litellm | Input: $15,000.00 | Output: $60,000.00 | Source: wandb, Context: 131072 | |
| watsonx | litellm | Input: $0.15 | Output: $0.60 | Source: watsonx, Context: 8192 | |
| openrouter | openrouter | Input: $0.02 | Output: $0.10 | gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases. It activates 5.1B parameters per forward pass and is optimized to run on a single H100 GPU with native MXFP4 quantization. The model supports configurable reasoning depth, full chain-of-thought access, and native tool use, including function calling, browsing, and structured output generation. Context: 131072 | |
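All prices in the table are quoted per 1M tokens, billed separately for input and output. A minimal sketch of how to turn those rates into a per-request cost estimate (the Groq rates used in the example are copied from the table above; verify current pricing with each provider before relying on it):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Dollar cost of one request at the given $/1M-token rates."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Example: a 4,000-token prompt with a 1,000-token completion at
# $0.15 input / $0.60 output per 1M tokens (Groq's rates in the table).
cost = request_cost(4_000, 1_000, 0.15, 0.60)
print(f"${cost:.6f}")  # 4000*0.15/1e6 + 1000*0.60/1e6 = $0.0012
```

Because reasoning models emit chain-of-thought tokens that are billed as output, output-token counts (and therefore the output rate) tend to dominate cost for this model.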
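Most of the providers above serve this model through OpenAI-compatible chat-completions endpoints. A sketch of the request payload, under stated assumptions: the model id `openai/gpt-oss-120b` is OpenRouter's naming (other providers may use a bare `gpt-oss-120b`), and the `reasoning_effort` field is a hypothetical illustration of the model's configurable reasoning depth that not every gateway accepts:

```python
import json

# Assumption: OpenRouter-style model id; check each provider's docs for
# the exact identifier it expects.
payload = {
    "model": "openai/gpt-oss-120b",
    "messages": [
        {"role": "user",
         "content": "Summarize MXFP4 quantization in one sentence."}
    ],
    # gpt-oss supports configurable reasoning depth; this parameter name
    # is an assumption and may not be passed through by all providers.
    "reasoning_effort": "medium",
    "max_tokens": 512,
}

print(json.dumps(payload, indent=2))
```

The payload would then be POSTed to the provider's chat-completions URL with the provider's API key; only the base URL, model id, and supported extra parameters differ between the vendors listed in the table.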