Gemini 2.5 Flash-Lite, previously in preview, is now stable and generally available. This cost-efficient model provides high quality in a small size, and includes 2.5 family features like a 1 million-token context window and multimodality.
Related items
AIGoogle DeepMind
Gemini 3.1 Flash-Lite: Built for intelligence at scale
Gemini 3.1 Flash-Lite is our fastest and most cost-efficient Gemini 3 series model yet.
AIGoogle DeepMind
Introducing the Gemini 2.5 Computer Use model
Available in preview via the API, our Computer Use model is a specialized model built on Gemini 2.5 Pro’s capabilities to power agents that can interact with user interfaces.
AIHugging Face Blog
From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels
AIGoogle DeepMind
Gemini 3 Flash: frontier intelligence built for speed
Gemini 3 Flash offers frontier intelligence built for speed at a fraction of the cost.
AIGoogle AI
Reduce friction and latency for long-running jobs with Webhooks in Gemini API
Gemini API
AIGoogle DeepMind
Gemini 3.1 Flash Live: Making audio AI more natural and reliable
Our latest voice model has improved precision and lower latency to make voice interactions more fluid, natural and precise.