
gemini-2.5-pro

Extended version of Gemini 2.5 Pro with ultra-long context support (beyond 200k tokens, reaching into the millions depending on setup). Designed to handle large documents, full repositories, or extensive datasets in a single session.
Best for:
Long-context processing: legal contracts spanning thousands of pages, large codebases, academic research, reviewing historical chat/log data. Ideal for copilots that need to “remember” or reason over very large corpora.
Input: Text, Image, Audio, Video, Code
Output: Text
Information sourced from the Gemini website.
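Whether a corpus will fit in an ultra-long context window can be estimated before sending anything. The sketch below uses a rough 4-characters-per-token heuristic — an assumption for illustration, not the model's real tokenizer — to pre-check a document set against a chosen window size:

```python
# Rough pre-flight check: will a document set fit in a long-context window?
# The 4-chars-per-token ratio is a crude English-text heuristic, not the
# model's actual tokenizer; use a real token counter for production work.
CHARS_PER_TOKEN = 4

def estimated_tokens(text: str) -> int:
    """Approximate token count from character length."""
    return len(text) // CHARS_PER_TOKEN

def fits_in_window(documents: list[str], window_tokens: int,
                   reserved_output_tokens: int = 8_192) -> bool:
    """True if the combined documents still leave room for the reply."""
    total = sum(estimated_tokens(d) for d in documents)
    return total + reserved_output_tokens <= window_tokens

docs = ["x" * 400_000, "y" * 800_000]  # roughly 100k + 200k estimated tokens
print(fits_in_window(docs, window_tokens=1_048_576))  # True
```

A real application would replace the heuristic with the API's own token-counting endpoint, but the budgeting logic stays the same.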

gemini-2.5-flash

Gemini 2.5 Flash is a fast and cost-effective model that balances performance with a wide range of capabilities. It is the first Flash model to feature thinking capabilities, which let you see the model's thinking process as it generates a response.
Best for:
Low latency/high throughput for chat, summarization, and multimodal extraction at a lower cost.
Input: Text, Image, Audio, Video, Code
Output: Text, Image, Audio, Video, Code
This model offers:
  • 1,048,576-token context window
  • 65,536 max output tokens
  • January 1, 2025 knowledge cutoff
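A knowledge cutoff means the model has not seen events after that date, so questions about later events should be routed through retrieval or tool use. A minimal sketch of that routing decision — the cutoff date comes from the list above, while the helper itself is hypothetical:

```python
from datetime import date

# gemini-2.5-flash knowledge cutoff, per the listing above.
KNOWLEDGE_CUTOFF = date(2025, 1, 1)

def needs_grounding(event_date: date, cutoff: date = KNOWLEDGE_CUTOFF) -> bool:
    """True if the question concerns events the model cannot have seen."""
    return event_date >= cutoff

print(needs_grounding(date(2024, 6, 1)))   # False: inside the training data
print(needs_grounding(date(2025, 3, 15)))  # True: after the cutoff
```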

gemini-2.5-flash-image

Gemini 2.5 Flash Image (also known as “Nano Banana”) is an efficient multimodal model optimized for high-quality image generation and editing. Built on the Gemini 2.5 Flash architecture, it enables seamless creation, enhancement, and fusion of visual content directly from text or mixed inputs. The model preserves visual coherence across edits and includes built-in SynthID watermarking for responsible image generation.
Best for:
Fast and cost-effective text-to-image generation, iterative image editing, and visual asset creation for marketing, design, and creative workflows.
Input: Text, Image
Output: Image

gemini-2.5-flash-lite

Lightweight, optimized version of Gemini Flash, designed for speed and low cost. Trades off some reasoning depth and output quality for efficiency. Maintains multimodal capabilities (text + image input) but focuses on ultra-fast responses.
Best for:
High-throughput, latency-sensitive tasks: chatbots with large user volumes, quick autocomplete, real-time customer support, fast retrieval-augmented generation (RAG), and mobile/embedded use cases where cost and speed matter more than depth.
Input: Text, Image, Audio
Output: Text, Audio
This model offers
  • * 1,000,000 context window
  • * 64,000 max output tokens
  • * January 1, 2025 knowledge cutoff
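The trade-off described above — speed and cost versus reasoning depth — suggests a simple routing rule. The helper below is a hypothetical sketch, not part of any SDK; the model IDs are the ones documented on this page:

```python
# Hypothetical request router based on the trade-offs described above:
# flash-lite for latency-sensitive, high-volume traffic; flash when the
# task needs deeper reasoning or a larger reply budget.
def pick_model(latency_sensitive: bool, needs_deep_reasoning: bool) -> str:
    if needs_deep_reasoning:
        return "gemini-2.5-flash"       # thinking model, 65,536 output tokens
    if latency_sensitive:
        return "gemini-2.5-flash-lite"  # ultra-fast, lowest cost
    return "gemini-2.5-flash"           # sensible default

print(pick_model(latency_sensitive=True, needs_deep_reasoning=False))
# gemini-2.5-flash-lite
```

In practice the routing signal might come from request metadata (user tier, expected response length) rather than boolean flags, but the shape of the decision is the same.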

gemini-2.0-flash

A Google model that performs like a Pro model with the speed of a Flash model. Multimodal inputs and outputs, native tool use, great for agentic workflows.
Best for:
High-volume workflows: customer interaction, summarization, data extraction, and fast-response agents. Good when you need speed with decent reasoning.
Input: Text, Image, Audio, Video, Code
Output: Text, Image, Audio
This model offers:
  • 1,000,000-token context window
  • 8,000 max output tokens
  • August 1, 2024 knowledge cutoff
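For the high-volume extraction and summarization workflows mentioned above, inputs longer than the context window have to be split and streamed through the model in pieces. A minimal chunking sketch, again assuming a rough 4-characters-per-token heuristic rather than the real tokenizer:

```python
# Split a long text into pieces that each fit the model's input window, so
# a high-volume extraction job can stream chunks through gemini-2.0-flash.
# 4 chars/token is a rough heuristic, not the actual tokenizer.
CHARS_PER_TOKEN = 4
WINDOW_TOKENS = 1_000_000  # gemini-2.0-flash window, per the listing above

def chunk_text(text: str, window_tokens: int = WINDOW_TOKENS) -> list[str]:
    """Greedy character-based split sized to the token window."""
    limit = window_tokens * CHARS_PER_TOKEN
    return [text[i:i + limit] for i in range(0, len(text), limit)]

chunks = chunk_text("a" * 10_000_000)  # ~2.5M estimated tokens of input
print(len(chunks))  # 3
```

Note that each chunk's *reply* is separately capped by the 8,000 max output tokens listed above, so per-chunk summaries must be merged in a second pass if a longer combined output is needed.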

gemini-2.0-flash-lite

Ultra-light, lowest-latency Gemini variant. Optimized for efficiency rather than depth, while maintaining multimodal support.
Best for:
Mobile or embedded copilots, lightweight RAG, chatbots, autocomplete, and fast contextual lookups where cost and response time are more critical than complex reasoning.
Input: Text, Image, Audio
Output: Text
This model offers:
  • 1,000,000-token context window
  • 8,000 max output tokens
  • February 5, 2025 knowledge cutoff