Llama 4 Scout 17B 16E Instruct

llama-4-scout

The Llama 4 collection of models are natively multimodal AI models that enable text and multimodal experiences. These models leverage a mixture-of-experts architecture to offer industry-leading performance in text and image understanding. Llama 4 Scout, a 17 billion parameter model with 16 experts. Served by DeepInfra.

Provider	Source	Input Price ($/1M)	Output Price ($/1M)	Description
vercel	vercel	Input: $0.08	Output: $0.30	The Llama 4 collection of models are natively multimodal AI models that enable text and multimodal experiences. These models leverage a mixture-of-experts architecture to offer industry-leading performance in text and image understanding. Llama 4 Scout, a 17 billion parameter model with 16 experts. Served by DeepInfra.
together	together	Input: $0.18	Output: $0.59	-
helicone	models-dev	Input: $0.08	Output: $0.30	Provider: Helicone, Context: 131072, Output Limit: 8192
openrouter	openrouter	Input: $0.08	Output: $0.30	Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model developed by Meta, activating 17 billion parameters out of a total of 109B. It supports native multimodal input (text and image) and multilingual output (text and code) across 12 supported languages. Designed for assistant-style interaction and visual reasoning, Scout uses 16 experts per forward pass and features a context length of 10 million tokens, with a training corpus of ~40 trillion tokens. Built for high efficiency and local or commercial deployment, Llama 4 Scout incorporates early fusion for seamless modality integration. It is instruction-tuned for use in multilingual chat, captioning, and image understanding tasks. Released under the Llama 4 Community License, it was last trained on data up to August 2024 and launched publicly on April 5, 2025. Context: 327680

Llama 4 Scout 17B 16E Instruct

Available at 4 Providers