gemini-2.5-flash-lite-preview-09-2025
Gemini 2.5 Flash-Lite is a balanced, low-latency model with configurable thinking budgets and tool connectivity (e.g., Google Search grounding and code execution). It supports multimodal input and offers a 1M-token context window.
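The configurable thinking budget mentioned above is set per request. A minimal sketch of assembling a `generateContent` request body with a thinking budget, assuming the `generationConfig.thinkingConfig.thinkingBudget` field of the public Gemini REST API (field names may differ across SDK versions):

```python
# Sketch: a generateContent request body for Gemini 2.5 Flash-Lite with an
# explicit thinking budget. thinkingBudget=0 disables thinking for lowest
# latency; a positive value allows up to that many thinking tokens.
# Field names assume the public Gemini REST API surface.

def build_request(prompt: str, thinking_budget: int = 0) -> dict:
    """Assemble a minimal generateContent payload with a thinking budget."""
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": thinking_budget},
        },
    }

body = build_request("Summarize this paragraph.", thinking_budget=512)
```

The payload would then be POSTed to the model's `generateContent` endpoint with an API key.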
| Provider | Source | Input Price ($/1M) | Output Price ($/1M) | Description | Free |
|---|---|---|---|---|---|
| vercel | vercel | $0.10 | $0.40 | Gemini 2.5 Flash-Lite is a balanced, low-latency model with configurable thinking budgets and tool connectivity (e.g., Google Search grounding and code execution). It supports multimodal input and offers a 1M-token context window. | |
| google | models-dev | $0.10 | $0.40 | Provider: Google, Context: 1048576, Output Limit: 65536 | |
| googlevertex | models-dev | $0.10 | $0.40 | Provider: Vertex, Context: 1048576, Output Limit: 65536 | |
| vertex | litellm | $0.10 | $0.40 | Source: vertex, Context: 1048576 | |
| gemini | litellm | $0.10 | $0.40 | Source: gemini, Context: 1048576 | |
| openrouter | openrouter | $0.10 | $0.40 | Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance across common benchmarks compared to earlier Flash models. By default, "thinking" (i.e. multi-pass reasoning) is disabled to prioritize speed, but developers can enable it via the [Reasoning API parameter](https://openrouter.ai/docs/use-cases/reasoning-tokens) to selectively trade off cost for intelligence. Context: 1048576 | |
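Since every provider in the table lists the same rates, per-request cost is easy to estimate. A minimal sketch using the listed prices ($0.10 per 1M input tokens, $0.40 per 1M output tokens); actual billing may also count thinking tokens as output:

```python
# Sketch: estimating the dollar cost of one request at the rates listed
# in the table above ($/1M tokens). Thinking tokens, if enabled, are
# assumed here to bill at the output rate.

INPUT_PRICE_PER_M = 0.10   # $ per 1M input tokens
OUTPUT_PRICE_PER_M = 0.40  # $ per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost for a single request at the listed rates."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# A 10k-token prompt with a 2k-token response:
print(round(estimate_cost(10_000, 2_000), 6))  # → 0.0018
```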