Transcribe endpoint

Dictation combines speech recognition with command-and-control capabilities. It is a powerful tool for clinicians and can help drive EHR adoption, streamline documentation capture, and automate repetitive tasks. It can be made available in desktop, web, and mobile applications and in a variety of languages.

This page explains the key features of Corti dictation, which are enabled by the /transcribe API endpoint. Corti provides both a dictation API and a web component, ready for you to integrate with your app.

Dictation is supported for languages in the Enhanced or Premier tier. See the languages page for the full list of features supported per language.

Feature availability per language

Language	Language Code	Formatting	Vocabulary
Danish	da	`coming soon`
Dutch	nl
English (US)	en		`coming soon`
English (UK)	en-GB	`coming soon`
French	fr	`coming soon`
German	de	`coming soon`
Hungarian	hu
Norwegian	no
Swedish	sv

Features

Languages

Corti speech recognition is designed specifically for the healthcare domain. A tier system categorizes the functionality and performance available per language and endpoint. Languages in the Enhanced and Premier tiers offer the most functionality and the highest recognition accuracy, so they are the ones recommended for dictation use.

Commands

Commands are a key capability that takes the system beyond speech-to-text to a complete dictation solution. Put your users in the driver's seat of their workflow by defining commands to insert templates, navigate the application, automate repetitive tasks, and more.
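
As a rough illustration of the idea, the sketch below shows one way a client application might route recognized commands to handlers. The `CommandRegistry` class, the command IDs, and the handlers are all hypothetical. They are not part of the Corti API, and the actual command definition format is not covered on this page.

```python
from typing import Callable, Dict


class CommandRegistry:
    """Hypothetical client-side registry mapping command IDs to handlers."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[], str]] = {}

    def register(self, command_id: str, handler: Callable[[], str]) -> None:
        # Associate a command ID (assumed naming) with an application action.
        self._handlers[command_id] = handler

    def dispatch(self, command_id: str) -> str:
        # Run the handler for a recognized command, or report an unknown one.
        handler = self._handlers.get(command_id)
        if handler is None:
            return f"unknown command: {command_id}"
        return handler()


registry = CommandRegistry()
registry.register("insert_template", lambda: "template inserted")
registry.register("next_section", lambda: "moved to next section")
```

Keeping the mapping in one place makes it easy to add or remove commands without touching the code that receives recognition results.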

Punctuation

Punctuation is essential for coherent documentation. Setting the spokenPunctuation parameter to true in the /transcribe configuration lets users control when punctuation is inserted in the document output. Additionally, the automaticPunctuation parameter can be used to have the AI model add periods and commas as appropriate.
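
A minimal sketch of how these two parameters might be set. Only `spokenPunctuation` and `automaticPunctuation` come from this page. The `build_config` helper, the `primaryLanguage` key, and the overall shape of the configuration object are assumptions for illustration, so check the API reference for the exact schema and for whether the two options can be combined.

```python
import json


def build_config(language: str, spoken: bool, automatic: bool) -> str:
    """Serialize a hypothetical /transcribe configuration to JSON."""
    config = {
        "primaryLanguage": language,        # assumed key name, not from this page
        "spokenPunctuation": spoken,        # user dictates punctuation explicitly
        "automaticPunctuation": automatic,  # model adds periods and commas
    }
    return json.dumps(config)


# Example: the user dictates punctuation, automatic punctuation is off.
payload = build_config("en", spoken=True, automatic=False)
```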

Diarization

Diarization is the process of segmenting an audio recording by speaker, assigning portions of speech to distinct identities (e.g., “Doctor,” “Patient”). This enables accurate transcription, attribution, and analysis of multi-speaker clinical conversations, but is not required for effective AI scribing or workflow speech-enablement.
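
To make the idea of speaker attribution concrete, the sketch below merges consecutive segments from the same speaker into turns. The `Segment` type, its field names, and the sample data are hypothetical and do not reflect the actual response format of the API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    speaker: str   # e.g. "Doctor" or "Patient" (assumed labels)
    start: float   # seconds
    end: float     # seconds
    text: str


def merge_turns(segments: List[Segment]) -> List[Segment]:
    """Combine adjacent segments that share a speaker into one turn."""
    turns: List[Segment] = []
    for seg in segments:
        if turns and turns[-1].speaker == seg.speaker:
            last = turns[-1]
            # Extend the previous turn instead of starting a new one.
            turns[-1] = Segment(last.speaker, last.start, seg.end,
                                f"{last.text} {seg.text}")
        else:
            turns.append(seg)
    return turns


segments = [
    Segment("Doctor", 0.0, 2.0, "How are you"),
    Segment("Doctor", 2.0, 3.0, "feeling today?"),
    Segment("Patient", 3.0, 5.0, "Much better."),
]
turns = merge_turns(segments)
```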

Formatting

beta
Speech recognition can be used to create a verbatim transcript of the audio; however, some content is not documented in the same manner as it is verbalized. The formatting features provide control over how key information should be represented in the textual output.

Vocabulary

coming soon
Access to and control over the vocabulary used by the speech recognition models will give organizations greater control over the dictation experience. Organizations will gain visibility into the terminology the models are trained on and be able to update the vocabulary as needed to optimize for localized or specialized needs, respond to reported issues, and keep pace with changes in medical practice and communication.

Dictation workflows

Hold-to-talk

This workflow is most common for dictation with a handheld microphone. The user has direct control over the microphone: press and hold the record button to turn it on, and release the button to turn it off.

Toggle-to-talk

This workflow is most common for dictation with a wearable or desktop microphone. The microphone is turned on and stays in an active recording state until it is turned off.
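
The difference between hold-to-talk and toggle-to-talk can be sketched as a small state machine. The `Mode` enum and `MicController` class are hypothetical names used only for illustration; a real integration would start and stop the audio stream where this sketch flips the `recording` flag.

```python
from enum import Enum


class Mode(Enum):
    HOLD = "hold-to-talk"
    TOGGLE = "toggle-to-talk"


class MicController:
    """Hypothetical controller tracking whether the microphone is recording."""

    def __init__(self, mode: Mode) -> None:
        self.mode = mode
        self.recording = False

    def button_down(self) -> None:
        if self.mode is Mode.HOLD:
            # Hold-to-talk: recording starts while the button is pressed.
            self.recording = True
        else:
            # Toggle-to-talk: each press flips the recording state.
            self.recording = not self.recording

    def button_up(self) -> None:
        if self.mode is Mode.HOLD:
            # Hold-to-talk: releasing the button stops recording.
            self.recording = False
        # Toggle-to-talk: releasing the button has no effect.


hold = MicController(Mode.HOLD)
toggle = MicController(Mode.TOGGLE)
```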

Please contact us for more information or help.
