Configure keyterms to bias speech-to-text output. Keyterms let you introduce terms that are unfamiliar to the speech recognition models and improve recognition reliability for homophones and words with ambiguous pronunciation. They are especially useful for proper nouns (surnames, facilities, locations), organization-specific terminology, and other terms that are new or unknown to the speech-to-text system.

Feature availability:

/transcribe

/streams

/transcripts

Looking to add a custom user dictionary to your dictation app? Use the keyterms configuration feature to define single words or multi-word phrases that should be recognized in speech-to-text output. See configuration details here.

How It Works

Currently, only the written form of a word needs to be specified in the keyterms array to improve recognition accuracy. Keyterm configuration is case sensitive: the casing you define is preserved in the biased output. This differs from replacements, which are case insensitive.
Keyterm configuration is currently limited to 1,000 items per connection.
A defined term is limited to a length of 50 characters.
The API is designed for extensibility: more functionality is planned for defined terms (e.g., category, pronunciation). This feature is also expected to work in conjunction with a vocabulary endpoint (under development), which will provide the ability to look up words already known to the system and to bulk upload terms for recognition biasing.
Use of keyterms is optional for dictation configuration. Please contact us to report errors or for more information on this feature.
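
The limits above (1,000 items per connection, 50 characters per term, case-sensitive matching) can be checked on the client before a list is sent. The following is a minimal sketch, not part of the API: the function name `validate_keyterms` is hypothetical, it assumes each keyterm is handled as a plain string holding the written form, and the whitespace trimming and de-duplication are illustrative choices rather than documented service behavior.

```python
MAX_KEYTERMS = 1000     # documented limit: items per connection
MAX_TERM_LENGTH = 50    # documented limit: characters per term


def validate_keyterms(terms):
    """Return a cleaned keyterm list, or raise ValueError if a limit is exceeded.

    Hypothetical helper: trimming and de-duplication are assumptions made
    for this sketch, not behavior described by the service.
    """
    cleaned = []
    seen = set()
    for term in terms:
        term = term.strip()
        if not term:
            raise ValueError("keyterm must not be empty")
        if len(term) > MAX_TERM_LENGTH:
            raise ValueError(
                f"keyterm exceeds {MAX_TERM_LENGTH} characters: {term!r}"
            )
        # Keyterms are case sensitive, so "Nguyen" and "nguyen" are kept
        # as two distinct entries; only exact duplicates are dropped.
        if term not in seen:
            seen.add(term)
            cleaned.append(term)
    if len(cleaned) > MAX_KEYTERMS:
        raise ValueError(
            f"{len(cleaned)} keyterms given; the limit is {MAX_KEYTERMS}"
        )
    return cleaned


if __name__ == "__main__":
    print(validate_keyterms(["Nguyen", "St. Mary's Medical Center", "Nguyen"]))
```

Running the check before opening a connection surfaces an oversized list or an overlong term locally instead of at request time. How the service itself reports such violations is not described here.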