All ElevenLabs models below are accessible through the Hicap API. ElevenLabs provides industry-leading text-to-speech (TTS) and speech-to-text (STT) capabilities. For pricing, see the Model Catalog.

Text-to-Speech Models

eleven_multilingual_v2

Eleven Multilingual v2 is ElevenLabs’ flagship multilingual text-to-speech model, supporting 29 languages with natural-sounding, expressive voice synthesis. It delivers high-quality speech with nuanced prosody and emotional range.
Best for:
Multilingual voice applications, audiobook narration, content localization, accessibility features, and customer-facing voice experiences where natural speech quality is essential.
Input: Text
Output: Audio
This model offers:
  • 10,000-character context window
  • Support for 29 languages

eleven_v3

Eleven v3 is the latest-generation ElevenLabs TTS model, delivering improved voice quality, faster generation, and enhanced expressiveness, with a more compact 5,000-character context window.
Best for:
Real-time voice assistants, interactive applications, short-form content, and latency-sensitive voice experiences where fast generation matters.
Input: Text
Output: Audio
This model offers:
  • 5,000-character context window
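
Because the two text-to-speech models accept different amounts of text, longer input may need to be split before it is sent. The following is a minimal sketch rather than part of the Hicap or ElevenLabs API: the helper name `split_for_model`, the `CHAR_LIMITS` table, and the sentence-based splitting strategy are all illustrative assumptions. It also assumes that the "context window" is a per-request cap on input characters and that characters are counted the way Python's `len` counts them, which the service may not do.

```python
import re

# Character limits taken from the model descriptions above.
# Assumption: the limit applies to the input text of a single request.
CHAR_LIMITS = {
    "eleven_multilingual_v2": 10_000,
    "eleven_v3": 5_000,
}


def split_for_model(text: str, model: str) -> list[str]:
    """Split text into chunks no longer than the model's character limit.

    Sentences are kept whole where possible; a single sentence longer
    than the limit is cut at the limit. (Hypothetical helper.)
    """
    limit = CHAR_LIMITS[model]
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Hard-cut any sentence that cannot fit in one chunk by itself.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be submitted as a separate request and the resulting audio segments concatenated in order. Splitting on sentence boundaries is a design choice intended to avoid cutting a request mid-sentence, which would likely produce unnatural prosody at the seams.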

Speech-to-Text Models

scribe_v1

Scribe v1 is ElevenLabs’ speech-to-text transcription model, supporting 90+ languages with accurate transcription of spoken audio into text.
Best for:
Audio transcription, meeting notes, podcast indexing, subtitle generation, and voice-to-text workflows across multiple languages.
Input: Audio
Output: Text
This model offers:
  • Support for 90+ languages

scribe_v2

Scribe v2 is the latest generation of ElevenLabs’ transcription model, offering improved accuracy and language coverage over Scribe v1.
Best for:
High-accuracy transcription, professional media workflows, real-time captioning, and enterprise audio processing where transcription quality is critical.
Input: Audio
Output: Text
This model offers:
  • Support for 90+ languages