Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.hicap.ai/llms.txt

Use this file to discover all available pages before exploring further.

All OpenAI models below are accessible through the Hicap API using the standard OpenAI API spec. Point your OpenAI SDK at https://api.hicap.ai/v1 and use any model ID listed here. For pricing, see the Model Catalog.

gpt-5.5

GPT-5.5 is OpenAI’s latest flagship model available through the Hicap API. Use the gpt-5.5 model ID in OpenAI-compatible endpoints for top-tier general-purpose reasoning, coding, analysis, and multimodal workloads.
Best for:
Advanced reasoning, agentic coding, complex analysis, multimodal applications, and production assistants that need the strongest general-purpose OpenAI model available through Hicap.
InputOutput
Text, imageText
This model offers

gpt-5.5-pro

GPT-5.5 Pro is the pro-tier GPT-5.5 model available through the Hicap API. Use the gpt-5.5-pro model ID when your workload needs the highest capability option in the GPT-5.5 family.
Best for:
Deep research, demanding enterprise analysis, high-stakes coding and review workflows, and complex reasoning tasks where quality matters more than speed.
InputOutput
Text, imageText
This model offers

gpt-5.4

GPT-5.4 is OpenAI’s most advanced model, offering a massive 1.05M token context window with state-of-the-art reasoning, coding, and multimodal capabilities. It features long-context tiers above 272K tokens for cost-efficient processing of very large inputs.
Best for:
Ultra-long-context analysis, full-codebase reasoning, massive document processing, and demanding enterprise workflows that require the highest intelligence with extended context.
InputOutput
Text, imageText
This model offers
  • * 1,050,000 context window
  • * Long-context tier above 272,000 tokens

gpt-5.3-chat-latest

GPT-5.3 Chat Latest is a conversationally optimized model from the GPT-5.3 family, designed for responsive dialogue and instruction fidelity. It supports a 200K token context window and delivers strong performance across chat-based applications.
Best for:
Interactive chat applications, conversational assistants, customer support, and real-time dialogue where responsiveness and instruction following are critical.
InputOutput
Text, imageText
This model offers
  • * 200,000 context window

gpt-5.3-codex

GPT-5.3 Codex is a code-optimized variant from the GPT-5.3 family, purpose-built for software development workflows. It excels at code generation, editing, debugging, and agentic coding tasks with a 200K token context window.
Best for:
Agentic coding, code generation, debugging, refactoring, automated development workflows, and IDE integrations that require deep code understanding.
InputOutput
Text, CodeText, Code
This model offers
  • * 200,000 context window

gpt-5.2

GPT-5.2 is a highly capable general-purpose language model designed for advanced reasoning, long-context understanding, and instruction fidelity. It delivers stable, coherent outputs across complex tasks and extended conversations, while supporting multimodal inputs for richer and more contextual responses.
Best for:
Complex reasoning, long-form generation, decision support, and multimodal chat experiences where accuracy, consistency, and instruction adherence are critical.
InputOutput
Text, imageText
This model offers
  • * 400,000 context window
  • * 128,000 max output tokens
  • * August 31, 2025 knowledge cutoff

gpt-5.2-chat-latest

GPT-5.2 Chat is a conversationally optimized variant of GPT-5.2, designed to deliver natural, responsive, and context-aware dialogue. It emphasizes turn-by-turn coherence, low-latency interactions, and strong instruction following in chat-based environments, while retaining advanced reasoning and multimodal understanding.
Best for:
Interactive chat applications, assistants, customer support, and real-time conversational experiences where responsiveness, conversational flow, and contextual continuity are essential.
InputOutput
Text, imageText
This model offers
  • * 128,000 context window
  • * 16,384 max output tokens
  • * August 31, 2025 knowledge cutoff

gpt-5.1

GPT-5.1 is a reliable and efficient language model that offers strong general-purpose capabilities with balanced reasoning, instruction following, and text generation. It provides consistent performance across common NLP tasks while prioritizing stability and cost efficiency.
Best for:
General-purpose text generation, summarization, content drafting, and standard chat or assistant workloads where reliability and efficiency are more important than maximum reasoning depth.
InputOutput
Text, imageText
This model offers
  • * 400,000 context window
  • * 128,000 max output tokens
  • * September 30, 2024 knowledge cutoff

gpt-5.1-chat-latest

GPT-5.1 Chat is a conversationally tuned language model optimized for responsive, natural dialogue and consistent instruction following. It builds on GPT-5.1’s stable core capabilities while prioritizing low-latency interactions and smooth conversational flow in chat-based applications.
Best for:
Everyday chat assistants, customer support, guided interactions, and conversational interfaces where speed, clarity, and reliability are the primary requirements.
InputOutput
Text, imageText
This model offers
  • * 128,000 context window
  • * 16,384 max output tokens
  • * September 30, 2024 knowledge cutoff

gpt-5

Full-scale flagship OpenAI model. Delivers state-of-the-art reasoning, creativity, coding, and multimodal support. Handles long contexts (hundreds of thousands of tokens) and complex workflows.
Best for:
Enterprise-grade copilots, advanced software development, strategic research, product/financial/legal analysis, high-quality multimodal content generation. Best when you need maximum depth and reliability.
InputOutput
Text, imageText
This model offers
  • * 400,000 context window
  • * 128,000 max output tokens
  • * September 30, 2024 knowledge cutoff

gpt-5-chat

This use case employs OpenAI’s foundational GPT-5 model (which is the default model in the ChatGPT interface). It is a general-purpose multimodal reasoning engine designed to understand, generate, and synthesize complex information in a coherent and contextually relevant manner, simulating a fluid human conversation without requiring a specific “chat” model.
Best for:
General conversational interaction and multimodal reasoning. Ideal for answering questions, generating creative and explanatory content, summarizing documents, translating languages, analyzing images, and serving as a versatile virtual assistant across a wide range of non-specialized tasks.
InputOutput
Text, imageText, Image, Code
This model offers
  • * 400,000 context window
  • * 128,000 max output tokens
  • * September 30, 2024 knowledge cutoff

gpt-5-mini

A lighter and faster variant of GPT-5. Preserves strong reasoning and coding ability, but optimized for lower latency and cost. Good balance between performance and efficiency.
Best for:
A lighter and faster variant of GPT-5. Preserves strong reasoning and coding ability, but optimized for lower latency and cost. Good balance between performance and efficiency.
InputOutput
Text, ImageText
This model offers
  • * 400,000 context window
  • * 128,000 max output tokens
  • * May 30, 2024 knowledge cutoff

gpt-5-nano

Ultra-light, low-latency model. Prioritizes speed and efficiency over advanced reasoning. Best for lightweight tasks where response time and throughput matter more than nuance.
Best for:
Real-time assistants, mobile/edge use cases, autocomplete, customer service chat, high-volume RAG queries. Best when cost + speed > deep reasoning.
InputOutput
Text, ImageText
This model offers
  • * 400,000 context window
  • * 128,000 max output tokens
  • * May 30, 2024 knowledge cutoff

gpt-4o

OpenAI’s high-intelligence flagship model for complex, multi-step tasks. GPT-4o is cheaper and faster than GPT-4 Turbo
Best for:
Multimodal copilots (voice, image, video), real-time assistants, design/code reviews, customer-facing apps, and enterprise agents that require both contextual reasoning + speed.
InputOutput
Text, Image, Audio, Video, CodeText, Audio, Image (limited)
This model offers
  • * 128,000 context window
  • * 16,384 max output tokens
  • * October 1, 2023 knowledge cutoff

gpt-4o-mini

OpenAI’s affordable and intelligent small model for fast, lightweight tasks. GPT-4o mini is cheaper and more capable than GPT-3.5 Turbo
Best for:
Scalable chatbots, QA systems, RAG pipelines, lightweight coding assistants, and UX/product copilots where cost efficiency + responsiveness matter.
InputOutput
Text, Image, CodeText
This model offers
  • * 128,000 context window
  • * 16,384 max output tokens
  • * October 1, 2023 knowledge cutoff

gpt-4.1

GPT-4.1 is the flagship model of the new 4.1 family, excelling in coding, following complex instructions, and managing long contexts (up to 1 million tokens), making it the powerful go-to choice for demanding applications.
Best for:
Advanced data analysis, high-fidelity content creation, software architecture design, and analytical copilots requiring maximum reliability and depth.
InputOutput
Text, CodeText
This model offers
  • * 1,000,000 context window
  • * 32,000 max output tokens
  • * June 1, 2024 knowledge cutoff

gpt-4.1-mini

GPT-4.1 mini is optimized for speed and efficiency, being 40% faster then GPT-4o. GPT-4.1 mini offers a balanced approach with robust performance for simpler or time-sensitive tasks, providing faster response times compared to its more feature-rich counterpart.
Best for:
Mid-complexity workflows: documentation QA, data summarization, low-latency coding copilots, or iterative product feedback systems.
InputOutput
Text, CodeText
This model offers
  • * 1,000,000 context window
  • * 32,000 max output tokens
  • * June 1, 2024 knowledge cutoff

gpt-4.1-nano

GPT-4.1 nano is the smallest, fastest, and most cost-effective option of the GPT-4.1 family. GPT-4.1 nano is ideal for high-volume applications such as autocomplete, classification, and extracting details from lengthy documents while maintaining a strong performance profile.
Best for:
Autocomplete, message classification, semantic search ranking, basic natural-language logic tasks, and embedded AI tools. Ideal for speed-critical micro-agents.
InputOutput
TextText
This model offers
  • * 1,000,000 context window
  • * 32,000 max output tokens
  • * June 1, 2024 knowledge cutoff