
General

Everything you need to know about using Hicap on a day-to-day basis
What is Hicap?
Hicap gives you fast, secure access to top closed AI models — like GPT-5 and Claude Sonnet 4.5 — at a fraction of the usual cost. We bulk-reserve compute from leading cloud providers and pass the savings on to you, all through a simple API.
What is reserved throughput, and why choose it over pay-as-you-go?
Reserved throughput locks in your compute and pricing — no rate limits, no surprise overages. Think of it like storing water while it’s cheap instead of paying full price from the tap during a drought. You commit to a volume up front and get a guaranteed rate, while pay-as-you-go charges you the spot price for every token. If your usage is predictable, reserved throughput saves you money and removes uncertainty.
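To make the trade-off concrete, here is a back-of-the-envelope sketch in Python. Every price and volume below is a hypothetical placeholder, not Hicap’s actual pricing, and the overage rule is a simplification for illustration.

```python
# Back-of-the-envelope comparison of reserved vs. pay-as-you-go token costs.
# All numbers here are hypothetical placeholders, not actual Hicap pricing.

SPOT_PRICE = 10.00       # $ per 1M tokens, pay-as-you-go
RESERVED_PRICE = 7.50    # $ per 1M tokens, with an up-front commitment
COMMITTED_VOLUME = 500   # millions of tokens committed per month

def monthly_cost(used_mtok: float) -> tuple[float, float]:
    """Return (pay_as_you_go, reserved) cost for a month's usage in $."""
    payg = used_mtok * SPOT_PRICE
    # Reserved: you pay for the committed volume even if you use less,
    # and (in this simplified model) the spot price for any overage.
    reserved = (
        COMMITTED_VOLUME * RESERVED_PRICE
        + max(0.0, used_mtok - COMMITTED_VOLUME) * SPOT_PRICE
    )
    return payg, reserved

payg, reserved = monthly_cost(500)
print(payg, reserved)  # at full utilization, reserved wins: 5000.0 vs 3750.0
```

The crossover is the usual one for reservations: at high, predictable utilization the committed rate wins; well below the committed volume, pay-as-you-go is cheaper.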
Can I use Hicap without committing to reserved capacity?
Yes. If you’re not ready to commit to reserved capacity, you can use Hicap on a pay-per-use basis. You get the same API, the same models, and the same reliability — you just pay per token at standard rates. It’s a great way to get started or handle unpredictable workloads before locking in a reservation.
How much will I save?
Savings of up to 25% are typical; the exact figure depends on your model usage blend, commitment length, and capacity needs.
Which models and clouds do you support?
We support the latest models from OpenAI, Anthropic, Google Gemini, and more, available on AWS, Azure, Google Cloud, and top Neo Clouds like Alibaba — we aim to be as flexible as our customers are. More models and clouds are coming soon; let us know what you need!
How long do I have to commit?
Start monthly! Scale up or down as your needs change, with no long-term lock-in. Want to pay even less? Sign up for a longer commitment for extra savings.
What if my traffic exceeds my reservation?
No stress. You can burst above your reservation as needed — just contact us, and we’ll help you scale up instantly. Your service stays smooth, even during traffic surges.
Can I use Hicap alongside my existing providers?
Yes. Hicap works great as one provider in a multi-provider setup. If you use an AI gateway (like LiteLLM, Portkey, or a custom router), you can configure Hicap as the preferred provider for specific models while keeping your existing provider accounts for others. Note that Hicap itself is an inference API — it doesn’t provide gateway or routing services. Your gateway handles the routing logic; Hicap handles the inference.

Security

Your trust and data security are our top priority
Is my data encrypted?
Yes. We use bank-grade encryption (AES-256) and secure communication protocols (TLS 1.3) to protect all data in transit and at rest. Importantly, Hicap does not store any customer messages or prompts sent to the LLM — your requests are forwarded to the model provider, and responses are returned to you without being logged or retained on our side.
Do you sell or share my data?
No. We never sell or share your data with third parties without your explicit consent. Your information is used only to provide our services.
What happens if there is a data breach?
Hicap follows strict incident-response protocols. In the event of a breach, affected users are notified promptly according to our notification timeline, and we work with them on remediation steps. For full details on our obligations and response procedures, see our Terms of Service.
Can I have my data deleted?
Absolutely. You can request data deletion at any time, and we will securely remove your information from our systems in compliance with GDPR and other applicable privacy regulations.
Does Hicap follow security best practices?
Yes. Hicap follows best practices such as encryption and role-based access control, and can be configured to meet compliance requirements such as SOC 2 and GDPR.
What data does Hicap have access to?
You decide. Hicap gives you full control over which applications you integrate, while keeping internal data and processing private.
How is my data protected end-to-end?
All traffic between your application and Hicap is encrypted with TLS 1.3. Hicap does not store your prompts or completions — requests are forwarded to the upstream model provider and responses are returned directly to you. Combined with encryption at rest for account and billing data, your sensitive information stays protected end-to-end.
Can Hicap run inside my own cloud environment?
Yes. Hicap can be deployed within your AWS environment, ensuring that all processing and data requests remain private and isolated.

Architecture

How Hicap fits into your stack
Hicap sits between your application and the upstream model providers. Here’s the typical flow:
Your App ──▶ Hicap API ──▶ Model Provider
(OpenAI SDK)  api.hicap.ai  (OpenAI / Anthropic / Google / etc.)
  1. Your application calls api.hicap.ai/v1 using any standard OpenAI SDK
  2. Hicap authenticates the request, routes it to the optimal provider endpoint using your reserved capacity (or pay-as-you-go), and returns the response
  3. No prompts or completions are stored by Hicap — only usage metadata for billing
If you use an AI gateway, Hicap is configured as a provider upstream — the gateway routes to Hicap, and Hicap routes to the model provider.
What do I need to change in my application?
Two things: the base URL and the API key. If you’re already using the OpenAI SDK, just set base_url to https://api.hicap.ai/v1 and use your Hicap API key. No other code changes are needed — the API is fully OpenAI-compatible.
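With the OpenAI Python SDK, that means constructing the client with base_url="https://api.hicap.ai/v1" and your Hicap key. If you’d rather see the wire format, this stdlib-only sketch builds (but does not send) the equivalent request; the model name and API key are placeholders:

```python
# Building an OpenAI-compatible chat request against Hicap's endpoint.
# Only the base URL and key differ from a direct OpenAI integration;
# the model name and API key below are placeholders.
import json
import urllib.request

BASE_URL = "https://api.hicap.ai/v1"  # was: https://api.openai.com/v1

def build_chat_request(model: str, messages: list, api_key: str) -> urllib.request.Request:
    """Return a ready-to-send POST request in the standard OpenAI wire format."""
    body = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("gpt-5", [{"role": "user", "content": "hello"}], "YOUR_HICAP_API_KEY")
print(req.full_url)  # https://api.hicap.ai/v1/chat/completions
```

Because the path, headers, and JSON body match the OpenAI wire format, any OpenAI-compatible SDK or tool works unchanged once the base URL and key are swapped.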