Introduction

Corti speech recognition and text generation are designed specifically for the healthcare domain. The automatic speech recognition (ASR) language models balance recognition speed, performance, and accuracy. For text generation LLMs, output performance and quality depend on the quality of the ASR transcription, the input processed (transcript vs. facts), and quality-assurance mechanisms such as documentation guardrails and fact refinement.
The language codes listed below are used in API requests to define both the speech recognition language and the document generation output language. You can query the API for the document templates available by language. Learn more here.

Corti Speech Recognition Performance Tiers

This page describes a tier system to categorize functionality and performance that is available per language. Each ASR endpoint supports different capabilities as described here.
| Tier | Description | |
| --- | --- | --- |
| Base | AI-powered speech-to-text capability, ready to integrate with healthcare IT solutions via the /stream or /transcripts API | Up to 1,000 |
| Enhanced | Base plus optimized medical vocabulary for a variety of specialties and support for real-time dictation via the /transcribe API | 1,000-99,999 |
| Premier | Enhanced plus speech recognition models delivering the best performance in terms of accuracy, quality, and latency | 100,000+ |
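
As a rough illustration of how the tiers build on one another, the sketch below encodes one reading of the tier descriptions: /stream and /transcripts from Base upward, /transcribe from Enhanced upward. The names `Tier`, `MIN_TIER_FOR_ENDPOINT`, and `endpoints_for` are hypothetical helpers for this example and are not part of the Corti API.

```python
from enum import IntEnum
from typing import List


class Tier(IntEnum):
    """ASR performance tiers, ordered so that a higher tier includes the lower ones."""
    BASE = 1
    ENHANCED = 2
    PREMIER = 3


# Assumption: the lowest tier at which each endpoint is described as supported,
# read from the tier descriptions. This mapping is illustrative only.
MIN_TIER_FOR_ENDPOINT = {
    "/stream": Tier.BASE,
    "/transcripts": Tier.BASE,
    "/transcribe": Tier.ENHANCED,
}


def endpoints_for(tier: Tier) -> List[str]:
    """Return the endpoints described as available at the given tier, sorted by name."""
    return sorted(
        endpoint
        for endpoint, min_tier in MIN_TIER_FOR_ENDPOINT.items()
        if tier >= min_tier
    )
```

Under these assumptions, `endpoints_for(Tier.BASE)` omits /transcribe, while Enhanced and Premier include all three endpoints.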

Availability per Language

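| Language | Language Code | ASR Performance Tier |
| --- | --- | --- |
| Arabic | ar | Base |
| Danish | da | Enhanced |
| Dutch | nl | Enhanced |
| English (US) | en / en-US | Premier² |
| English (UK) | en-GB | Premier² |
| French | fr | Premier² |
| German | de | Premier² |
| Hungarian | hu | Enhanced |
| Italian | it | Base |
| Norwegian | no | Enhanced |
| Portuguese | pt | Base |
| Spanish | es | Base |
| Swedish | sv | Enhanced |
| Swiss-German | de-CH³ | Enhanced² |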
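
For client code that needs to branch on tier, the availability table can be mirrored as a simple lookup. This is a minimal sketch that copies the codes and tiers listed above. The names `ASR_TIER_BY_LANGUAGE_CODE` and `asr_tier` are hypothetical, and the exact-match (case-sensitive) lookup is an assumption rather than documented API behavior.

```python
from typing import Optional

# Language code -> ASR performance tier, copied from the availability table.
# Both "en" and "en-US" are listed for English (US).
ASR_TIER_BY_LANGUAGE_CODE = {
    "ar": "Base",
    "da": "Enhanced",
    "nl": "Enhanced",
    "en": "Premier",
    "en-US": "Premier",
    "en-GB": "Premier",
    "fr": "Premier",
    "de": "Premier",
    "hu": "Enhanced",
    "it": "Base",
    "no": "Enhanced",
    "pt": "Base",
    "es": "Base",
    "sv": "Enhanced",
    "de-CH": "Enhanced",
}


def asr_tier(language_code: str) -> Optional[str]:
    """Return the tier for a listed language code, or None if the code is not listed."""
    return ASR_TIER_BY_LANGUAGE_CODE.get(language_code)
```

A snapshot like this goes stale as languages are added or tiers change, so it is best treated as a local cache of the table rather than a source of truth.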
Notes:
1 Use the language codes listed above for the outputLanguage parameter in POST /documents requests. The template(s) or section(s) requested must be available in the defined output language for document generation to succeed.
For workflows that use /stream for real-time transcript generation and fact extraction, note that the outputLanguage for facts must also be supported by speech recognition. While translation from English transcripts to facts in other languages (e.g. German, French, Danish) is generally supported, other translation language pairs are not quality assessed or performance benchmarked at this time.
2 Speech recognition accuracy for async audio file processing via the /transcripts endpoint may be degraded compared to real-time recognition via the /transcribe and /stream endpoints. An update to address this performance issue is expected in November 2025.
3 The Swiss-German (de-CH) speech recognition model is currently designed to transcribe dialectal Swiss-German; it will soon be disambiguated to better handle both dialectal Swiss-German (gsw-CH) and High Swiss-German (de-CH).
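
The constraints in notes 1 and 2 can be checked on the client before a request is sent. The sketch below is one possible pre-flight check, not Corti's own validation: `check_fact_output_language` and `transcripts_accuracy_caveat` are hypothetical helper names, and the code sets are copied from the availability table and its footnote markers.

```python
# Language codes listed in the availability table (supported by speech recognition).
ASR_LANGUAGE_CODES = frozenset({
    "ar", "da", "nl", "en", "en-US", "en-GB", "fr", "de",
    "hu", "it", "no", "pt", "es", "sv", "de-CH",
})

# Codes whose tier carries footnote 2 in the table.
NOTE_2_CODES = frozenset({"en", "en-US", "en-GB", "fr", "de", "de-CH"})


def check_fact_output_language(output_language: str) -> None:
    """Raise ValueError if outputLanguage for /stream facts is not a listed ASR language."""
    if output_language not in ASR_LANGUAGE_CODES:
        raise ValueError(
            f"outputLanguage {output_language!r} is not listed as supported "
            "by speech recognition"
        )


def transcripts_accuracy_caveat(language: str, endpoint: str) -> bool:
    """Return True when note 2 applies: a flagged language processed via /transcripts."""
    return endpoint == "/transcripts" and language in NOTE_2_CODES
```

A caller might run `check_fact_output_language("da")` before opening a /stream session and log a warning whenever `transcripts_accuracy_caveat` returns True.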

Please contact us if you are interested in a language that is not listed here, need help with tiers and endpoint definitions, or have questions about how to use language codes in API requests.