gemini-2.5-pro
Extended version of Gemini 2.5 Pro with ultra-long context support (beyond 200k tokens, reaching into the millions depending on setup). Designed to handle large documents, full repositories, or extensive datasets in a single session.
Best for:
Long-context processing: legal contracts spanning thousands of pages, large codebases, academic research, reviewing historical chat/log data. Ideal for copilots that need to “remember” or reason over very large corpora.
| Input | Output |
|---|---|
| Text, Image, Audio, Video, Code | Text |
Information from the Gemini website.
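Before sending a large corpus to a long-context model, it can help to estimate up front whether it fits the window. A minimal sketch using the common rough heuristic of ~4 characters per token; the `estimate_tokens` and `fits_in_context` helpers and that ratio are illustrative assumptions, not part of the Gemini API:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate via the ~4 characters/token heuristic.

    An approximation only; exact counts require the model's tokenizer.
    """
    return len(text) // 4

def fits_in_context(text: str, context_window: int, reserved_output: int = 0) -> bool:
    """Check whether a document plus a reserved output budget fits the window."""
    return estimate_tokens(text) + reserved_output <= context_window

# A ~2 MB contract is roughly 500k tokens: well inside a 1M-token window.
contract = "x" * 2_000_000
print(fits_in_context(contract, context_window=1_000_000, reserved_output=8_000))  # True
```

Such a pre-check is cheap insurance before paying for a request that would be truncated or rejected for exceeding the window.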
gemini-2.5-flash
Gemini 2.5 Flash is a fast and cost-effective model that balances performance with a wide range of capabilities. It is the first Flash model with thinking capabilities, letting you see the model's thinking process as it generates a response.
Best for:
Low latency/high throughput for chat, summarization, and multimodal extraction at a lower cost.
| Input | Output |
|---|---|
| Text, Image, Audio, Video, Code | Text, Image, Audio, Video, Code |
This model offers:
- 1,048,576-token context window
- 65,536 max output tokens
- January 1, 2025 knowledge cutoff
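When configuring a request, the output budget should not exceed the model's documented limit. A minimal sketch using the numbers listed above; the `clamp_output_tokens` helper is an illustrative assumption, not part of any Gemini SDK:

```python
# Documented limits for gemini-2.5-flash (from the list above).
GEMINI_25_FLASH_CONTEXT_WINDOW = 1_048_576
GEMINI_25_FLASH_MAX_OUTPUT = 65_536

def clamp_output_tokens(requested: int, limit: int = GEMINI_25_FLASH_MAX_OUTPUT) -> int:
    """Cap a requested max-output-tokens value at the model's hard limit."""
    return min(max(requested, 1), limit)

print(clamp_output_tokens(100_000))  # 65536: capped at the documented limit
```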
gemini-2.5-flash-image
Gemini 2.5 Flash Image (also known as "Nano Banana") is an efficient multimodal model optimized for high-quality image generation and editing. Built on the Gemini 2.5 Flash architecture, it enables seamless creation, enhancement, and fusion of visual content directly from text or mixed inputs. The model preserves visual coherence across edits and includes built-in SynthID watermarking for responsible image generation.
Best for:
Fast and cost-effective text-to-image generation, iterative image editing, and visual asset creation for marketing, design, and creative workflows.
| Input | Output |
|---|---|
| Text, Image | Image |
Information from the Gemini website.
gemini-2.5-flash-lite
Lightweight, optimized version of Gemini Flash, designed for speed and low cost. Trades off some reasoning depth and output quality for efficiency. Maintains multimodal capabilities (text, image, and audio input) but focuses on ultra-fast responses.
Best for:
High-throughput, latency-sensitive tasks: chatbots with large user volumes, quick autocomplete, real-time customer support, fast retrieval-augmented generation (RAG), and mobile/embedded use cases where cost and speed matter more than depth.
| Input | Output |
|---|---|
| Text, Image, Audio | Text, Audio |
This model offers:
- 1,000,000-token context window
- 64,000 max output tokens
- January 1, 2025 knowledge cutoff
gemini-2.0-flash
A Google model that performs like a Pro model at the speed of a Flash model. Multimodal inputs and outputs, native tool use, great for agentic workflows.
Best for:
High-volume workflows: customer interaction, summarization, data extraction, and fast-response agents. Good when you need speed with decent reasoning.
| Input | Output |
|---|---|
| Text, Image, Audio, Video, Code | Text, Image, Audio |
This model offers:
- 1,000,000-token context window
- 8,000 max output tokens
- August 1, 2024 knowledge cutoff
gemini-2.0-flash-lite
Ultra-light, lowest-latency Gemini variant. Optimized for efficiency rather than depth, while maintaining multimodal support.
Best for:
Mobile or embedded copilots, lightweight RAG, chatbots, autocomplete, and fast contextual lookups where cost and response time are more critical than complex reasoning.
| Input | Output |
|---|---|
| Text, Image, Audio | Text |
This model offers:
- 1,000,000-token context window
- 8,000 max output tokens
- February 5, 2025 knowledge cutoff