Google: Gemini 2.5 Flash-Lite

Released recently | 1M context | $0.10/M input tokens | $0.40/M output tokens

Gemini 2.5 Flash-Lite is the fastest and lowest cost model in the Gemini 2.5 family, designed to push the frontier of intelligence per dollar. Built for high-throughput applications requiring low latency, it excels at classification, translation, and summarization at scale. With native reasoning capabilities that can be optionally toggled on, it provides flexibility for both simple and complex tasks while maintaining exceptional speed.
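
As a quick illustration of the optional reasoning toggle described above, here is a minimal sketch assuming the google-genai Python SDK, an API key in the environment, and the gemini-2.5-flash-lite model ID; exact config field names may vary by SDK version.

```python
# Minimal sketch: calling Gemini 2.5 Flash-Lite with reasoning toggled off or on.
# Assumes the google-genai Python SDK (pip install google-genai) and an API key
# available in the environment; field names may differ across SDK versions.
from google import genai
from google.genai import types

client = genai.Client()  # API key read from the environment

# Fast path: disable thinking for simple, high-throughput tasks.
fast = client.models.generate_content(
    model="gemini-2.5-flash-lite",
    contents="Classify the sentiment of: 'The delivery was late again.'",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_budget=0)
    ),
)
print(fast.text)

# Complex path: allow a thinking budget for harder, multi-step tasks.
deep = client.models.generate_content(
    model="gemini-2.5-flash-lite",
    contents="Summarize the key risks in this contract clause: ...",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_budget=1024)
    ),
)
print(deep.text)
```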

Inputs: Text, Image
Outputs: Text

Pricing

Input: $0.10 per 1M tokens
Cached input: N/A
Output: $0.40 per 1M tokens

Input price comparison (per 1M tokens):
Gemini 2.5 Flash-Lite: $0.10
GPT-4o: $2.50
GPT-4.1: $2.00
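
To make the listed rates concrete, here is a small back-of-the-envelope cost estimator; the request volume and token counts are hypothetical.

```python
# Hypothetical cost estimate at the listed Gemini 2.5 Flash-Lite rates.
INPUT_PRICE_PER_M = 0.10   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 0.40  # USD per 1M output tokens

def estimate_cost(requests: int, input_tokens_each: int, output_tokens_each: int) -> float:
    """Return the total USD cost for a batch of identical requests."""
    input_total = requests * input_tokens_each / 1_000_000 * INPUT_PRICE_PER_M
    output_total = requests * output_tokens_each / 1_000_000 * OUTPUT_PRICE_PER_M
    return input_total + output_total

# Example: 100,000 classification calls, ~500 input and ~20 output tokens each
# -> 50M input tokens ($5.00) + 2M output tokens ($0.80) = $5.80 total.
print(f"${estimate_cost(100_000, 500, 20):.2f}")
```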

Use Case

High-volume classification tasks
Label large content streams at low per-token cost (see the sketch after this list)
Real-time translation at scale
Translate text with the low latency needed for user-facing workloads
Document summarization pipelines
Condense long documents within the 1M-token context window
Content moderation and filtering
Screen and filter content quickly enough for real-time decisions
Quick data extraction and parsing
Pull structured fields out of text with minimal added latency
Batch processing operations
Process large workloads in parallel with high throughput
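
As one sketch of the high-volume classification and batch-processing use cases above, the snippet below fans requests out over a thread pool; the ticket taxonomy, classify helper, and pool size are hypothetical, and the SDK call mirrors the earlier example.

```python
# Hypothetical sketch: high-volume classification with parallel requests.
# The label set, classify() helper, and thread-pool sizing are illustrative only.
from concurrent.futures import ThreadPoolExecutor
from google import genai
from google.genai import types

client = genai.Client()
LABELS = ["billing", "shipping", "product", "other"]  # hypothetical taxonomy

def classify(ticket: str) -> str:
    """Ask the model for a single label; thinking disabled for speed."""
    response = client.models.generate_content(
        model="gemini-2.5-flash-lite",
        contents=f"Classify this support ticket as one of {LABELS}: {ticket}",
        config=types.GenerateContentConfig(
            thinking_config=types.ThinkingConfig(thinking_budget=0)
        ),
    )
    return response.text.strip()

tickets = ["My refund never arrived.", "The package is a week late."]
with ThreadPoolExecutor(max_workers=8) as pool:
    for ticket, label in zip(tickets, pool.map(classify, tickets)):
        print(f"{label}: {ticket}")
```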

Features

Ultra-Fast Performance
Lower latency than both 2.0 Flash-Lite and 2.0 Flash
Cost Efficiency
Lowest cost in the Gemini 2.5 family
Multimodal Input
Process text, images, video, audio, and PDFs (an image-input sketch follows this list)
1M Token Context
Process up to 1 million tokens in a single request
Native Reasoning
Optional thinking mode for complex tasks
High Throughput
Optimized for scale and parallel processing
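
To illustrate the Multimodal Input feature, here is a minimal sketch that passes an image alongside a text prompt, again assuming the google-genai Python SDK; the file name and prompt are hypothetical.

```python
# Hypothetical sketch: sending an image plus a text prompt to Gemini 2.5 Flash-Lite.
# Assumes the google-genai Python SDK and a local file named "invoice.jpg";
# helper names and the model ID may differ by SDK version.
from google import genai
from google.genai import types

client = genai.Client()

with open("invoice.jpg", "rb") as f:
    image_part = types.Part.from_bytes(data=f.read(), mime_type="image/jpeg")

response = client.models.generate_content(
    model="gemini-2.5-flash-lite",
    contents=[image_part, "Extract the vendor name and total amount from this invoice."],
)
print(response.text)
```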