# llama-3.3-70b-instruct

Provider: Nvidia · Context: 128,000 · Output limit: 4,096
The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction-tuned generative model in 70B (text in/text out). The Llama 3.3 instruction-tuned, text-only model is optimized for multilingual dialogue use cases and outperforms many of the available open-source and closed chat models on common industry benchmarks. Supported languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. [Model Card](https://github.com/meta-llama/llama-models/blob/main/models/llama3_3/MODEL_CARD.md)

| Provider | Source | Input ($/1M) | Output ($/1M) | Context | Output Limit | Notes |
|---|---|---|---|---|---|---|
| nvidia | models-dev | $0.00 | $0.00 | 128,000 | 4,096 | Nvidia |
| githubmodels | models-dev | $0.00 | $0.00 | 128,000 | 32,768 | GitHub Models |
| azure | models-dev | $0.71 | $0.71 | 128,000 | 32,768 | Azure |
| helicone | models-dev | $0.13 | $0.39 | 128,000 | 16,400 | Helicone |
| wandb | models-dev | $0.71 | $0.71 | 128,000 | 32,768 | Weights & Biases |
| synthetic | models-dev | $0.90 | $0.90 | 128,000 | 32,768 | Synthetic |
| nanogpt | models-dev | $1.00 | $2.00 | 128,000 | 8,192 | NanoGPT |
| ionet | models-dev | $0.13 | $0.38 | 128,000 | 4,096 | IO.NET |
| azurecognitiveservices | models-dev | $0.71 | $0.71 | 128,000 | 32,768 | Azure Cognitive Services |
| llama | models-dev | $0.00 | $0.00 | 128,000 | 4,096 | Llama |
| scaleway | models-dev | $0.90 | $0.90 | 100,000 | 4,096 | Scaleway |
| azureai | litellm | $0.71 | $0.71 | 128,000 | — | litellm source: `azure_ai` |
| deepinfra | litellm | $0.23 | $0.40 | 131,072 | — | litellm source: `deepinfra` |
| hyperbolic | litellm | $0.12 | $0.30 | 131,072 | — | litellm source: `hyperbolic` |
| metallama | litellm | $0.00 | $0.00 | 128,000 | — | litellm source: `meta_llama` |
| nscale | litellm | $0.20 | $0.20 | — | — | litellm source: `nscale` |
| openrouter | openrouter | $0.10 | $0.32 | 131,072 | — | [Model Card](https://github.com/meta-llama/llama-models/blob/main/models/llama3_3/MODEL_CARD.md) |
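All prices above are quoted per one million tokens. As a minimal sketch of how the per-request cost works out, the snippet below hard-codes one row from the table (deepinfra: $0.23 in / $0.40 out); the function name and example token counts are illustrative, not part of any provider's API:

```python
# Per-1M-token prices copied from the deepinfra row of the table above.
INPUT_PRICE_PER_1M = 0.23   # $ per 1M input (prompt) tokens
OUTPUT_PRICE_PER_1M = 0.40  # $ per 1M output (completion) tokens


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single request: tokens / 1,000,000 * price."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_1M \
        + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_1M


# Example: a 4,000-token prompt with a 1,000-token completion.
cost = request_cost(4_000, 1_000)
print(f"${cost:.5f}")  # 0.004 * 0.23 + 0.001 * 0.40 = $0.00132
```

Swapping in another row's prices gives that provider's cost for the same traffic, which makes the spread in the table concrete: the same 5,000-token exchange ranges from free (nvidia, llama) to roughly 4x the deepinfra figure at nanogpt.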