Google

gemini-3-pro-preview

Gemini 3 Pro Preview is a next-generation, high-capability model focused on advanced reasoning, deep multimodal understanding, and long-context performance. As a preview release, it showcases cutting-edge improvements in analytical depth, instruction comprehension, and complex problem solving across text and multimodal inputs.

Best for:

Advanced reasoning, complex multimodal workflows, research-oriented tasks, and early adoption scenarios where maximum capability and depth are prioritized over cost or latency.

Input		Output
`Text, Image, Audio, Video, PDF`		`Text`

This model offers

* 1,000,000-token context window
* 64,000 max output tokens

Information provided from Gemini website.

gemini-3-flash-preview

Gemini 3 Flash is a fast and cost-efficient multimodal model designed for responsive interactions and scalable workloads. It balances solid reasoning and multimodal understanding with low latency, making it well suited for high-throughput applications.

Best for:

Low-latency, high-throughput chat, summarization, and multimodal extraction in production environments where speed and cost efficiency are critical.

Input		Output
`Text, Image`		`Text`

Information provided from Gemini website.

gemini-2.5-pro

Extended version of Gemini 2.5 Pro with ultra-long context support (beyond 200k tokens, reaching into the millions depending on setup). Designed to handle large documents, full repositories, or extensive datasets in a single session.

Best for:

Long-context processing: legal contracts spanning thousands of pages, large codebases, academic research, reviewing historical chat/log data. Ideal for copilots that need to “remember” or reason over very large corpora.

Input		Output
`Text, Image, Audio, Video, Code`		`Text`

Information provided from Gemini website.

gemini-2.5-flash

Gemini 2.5 Flash is a fast and cost-effective model that balances performance with a wide range of capabilities. It is the first Flash model to feature thinking capabilities, which lets you see the model’s thinking process as it generates a response.

Best for:

Low-latency, high-throughput chat, summarization, and multimodal extraction at a lower cost.

Input		Output
`Text, Image, Audio, Video, Code`		`Text, Image, Audio, Video, Code`

This model offers

* 1,048,576-token context window
* 65,536 max output tokens
* January 1, 2025 knowledge cutoff

Information provided from Gemini website.

gemini-2.5-flash-lite

Lightweight, optimized version of Gemini Flash, designed for speed and low cost. Trades off some reasoning depth and output quality for efficiency. Maintains multimodal capabilities (text + image input) but focuses on ultra-fast responses.

Best for:

High-throughput, latency-sensitive tasks: chatbots with large user volumes, quick autocomplete, real-time customer support, fast retrieval-augmented generation (RAG), and mobile/embedded use cases where cost and speed matter more than depth.

Input		Output
`Text, Image, Audio`		`Text, Audio`

This model offers

* 1,000,000-token context window
* 64,000 max output tokens
* January 1, 2025 knowledge cutoff

Information provided from Gemini website.

gemini-2.0-flash

A Google model that performs like a Pro model with the speed of a Flash model. Multi-modal inputs and outputs, native tool use, great for agentic workflows.

Best for:

High-volume workflows: customer interaction, summarization, data extraction, and fast-response agents. Good when you need speed with decent reasoning.

Input		Output
`Text, Image, Audio, Video, Code`		`Text, Image, Audio`

This model offers

* 1,000,000-token context window
* 8,000 max output tokens
* August 1, 2024 knowledge cutoff

Information provided from Gemini website.

gemini-2.0-flash-lite

Ultra-light, lowest-latency Gemini variant. Optimized for efficiency rather than depth, while maintaining multimodal support.

Best for:

Mobile or embedded copilots, lightweight RAG, chatbots, autocomplete, and fast contextual lookups where cost and response time are more critical than complex reasoning.

Input		Output
`Text, Image, Audio`		`Text`

This model offers

* 1,000,000-token context window
* 8,000 max output tokens
* February 5, 2025 knowledge cutoff

Information provided from Gemini website.
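
Choosing a model

The figures listed above can be collected into a small lookup table to shortlist models for a given request. The following is a minimal sketch, not part of any Google SDK: `ModelSpec`, `CATALOG`, and `candidates` are hypothetical names introduced here for illustration. It assumes that the context window bounds the prompt alone and that output is checked separately against the max output tokens. Models for which this page lists no limits are skipped rather than guessed at.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical record holding the figures listed on this page."""
    name: str
    inputs: FrozenSet[str]
    context_window: Optional[int]     # tokens; None where this page lists no figure
    max_output_tokens: Optional[int]  # tokens; None where this page lists no figure


# Values copied from the model entries above.
CATALOG = [
    ModelSpec("gemini-3-pro-preview",
              frozenset({"text", "image", "audio", "video", "pdf"}), 1_000_000, 64_000),
    ModelSpec("gemini-3-flash-preview",
              frozenset({"text", "image"}), None, None),
    ModelSpec("gemini-2.5-pro",
              frozenset({"text", "image", "audio", "video", "code"}), None, None),
    ModelSpec("gemini-2.5-flash",
              frozenset({"text", "image", "audio", "video", "code"}), 1_048_576, 65_536),
    ModelSpec("gemini-2.5-flash-lite",
              frozenset({"text", "image", "audio"}), 1_000_000, 64_000),
    ModelSpec("gemini-2.0-flash",
              frozenset({"text", "image", "audio", "video", "code"}), 1_000_000, 8_000),
    ModelSpec("gemini-2.0-flash-lite",
              frozenset({"text", "image", "audio"}), 1_000_000, 8_000),
]


def candidates(required_inputs: Iterable[str],
               prompt_tokens: int,
               output_tokens: int) -> List[str]:
    """Return names of models whose listed specs cover the request.

    Assumption: the prompt is compared against the context window and the
    requested output against the max output tokens, independently.
    """
    needed = frozenset(required_inputs)
    matches = []
    for spec in CATALOG:
        if not needed <= spec.inputs:
            continue  # a required input type is not listed for this model
        if spec.context_window is None or spec.max_output_tokens is None:
            continue  # no limits listed on this page; skip rather than guess
        if (prompt_tokens <= spec.context_window
                and output_tokens <= spec.max_output_tokens):
            matches.append(spec.name)
    return matches
```

For example, `candidates({"text", "video"}, 500_000, 10_000)` keeps only the models that list video input and allow at least 10,000 output tokens. The result is a shortlist in catalog order; weighing cost, latency, and reasoning depth among the remaining models is still up to the "Best for" notes above.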
