# Speech to Text

> Updates and improvements to Corti speech to text

Detailed documentation about Speech to Text is available [here](/stt/overview), and supported languages [here](/stt/languages/).

### Language expansion and updates

* Expansion of `Base` tier to add support for 27 new languages
* Languages upgraded from `Base` to `Enhanced` tier - Spanish, Arabic
* Expansion of `Enhanced` tier to add support for new language - Finnish
* Expansion of `Premier` tier to add support for new language - English (AU)

See full details on the [languages page](/stt/languages), including ability to search by language (name or code) and filter by performance tier.

Corti values the opportunity to expand to new markets, but we need your collaboration and partnership in speech-to-text validation to expand language support from base to enhanced or premier tiers. [Contact us](https://help.corti.app) to learn more.

### Replacements and Keyterms now available on Streams and Transcripts endpoints

The features [replacements](/stt/replacements) and [keyterms](/stt/keyterms) are now supported on all three STT endpoints - `transcribe`, `streams`, and `transcripts`.

Additionally, some configuration parameters have been updated on the [streams](/api-reference/streams) and [create transcripts](/api-reference/transcripts/create-transcript) endpoints for improved consistency. No timeline is currently planned to remove support for the deprecated parameters. See details in [upcoming changelog](/release-notes/changelog-upcoming#2026-06-16).
### Maintenance

* Ensure replacements correctly handle casing and punctuation
* Fix capitalization of first word in transcript (capitalize) and after colon and semicolon (lowercase)
* Update formatting option `locale:short` to use four-digit year (e.g., 2026)
* Increase maximum audio file duration accepted for [upload recordings](/api-reference/recordings/upload-recording) from 60 to 120 minutes

### Keyterm bias

New feature for `transcribe` web socket endpoint now available - define `keyterms` the system should be aware of and biased towards to ensure proper nouns unknown by the STT model are recognized and improve recognition reliability for other general terms.

See feature details [here](/stt/keyterms).

### Swiss French `fr-CH` now supported

Language code `fr-CH` is now supported for use of Swiss French speech to text. Beyond current French speech to text handling, this model has improved support for Swiss medications and returns punctuation without non-breaking spaces (as used in French (`fr` or `fr-FR`) documentation).

The [languages page](/stt/languages) has been updated, and detailed information for punctuation handling on text insertion is available [here](/stt/best-practices-transcribe#locale-specific-spacing).

### Commands with wildcard variable

New dictation feature, **wildcard variable commands** now available. Unlike `enum` command variables, `wildcard` variables provide the ability to recognize a command based on undefined, open-ended text. A literal trigger word, such as “select”, is required before defining a wildcard variable.

See feature details with examples [here](/stt/commands#commands-with-wildcard-variables) and configuration in the [API reference](/api-reference/transcribe#param-variables).
### Updated asynchronous method for audio file processing

The `/transcripts` endpoint provides batch audio file processing using a synchronous-to-asynchronous method, where requests process synchronously for 25 seconds before hitting the timeout and continuing asynchronously.

A new request parameter, `async`, is now available to force asynchronous processing. This method returns the 202 status response right away, instead of waiting the full timeout window, so that clients can poll the status endpoint and get the full transcript once it is complete.

See more details about `/transcripts` [here](/stt/transcripts) and the updated parameter in the [API reference](/api-reference/transcripts/create-transcript#body-async).

### Replacement Rules

New dictation feature, **replacement rules**, now available. Define terms to "find" and "replace" in STT output using the `replacements` configuration option in `/transcribe` requests.

See details and examples [here](/stt/replacements) and API configuration [here](/api-reference/transcribe#param-replacements).

Replacements configuration is limited to 1,000 items per connection.

### Bug fixes

* "D3" now correctly handled
* Updated measurement formatting so that "0°" is supported

### Raw PCM audio now supported

Audio processing has been updated so that raw PCM audio can be sent over web socket secure (wss) audio streaming with `/transcribe` and `/streams`. When using raw PCM audio, `audioFormat` must be declared in configuration.

See details [here](/stt/audio#raw-audio).
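As a rough illustration of what "raw PCM" means in practice, the sketch below converts floating-point samples to headerless 16-bit little-endian PCM bytes and splits them into fixed-duration chunks suitable for streaming. The helper names, the 16 kHz mono format, and the 100 ms chunk size are assumptions for illustration only; whatever encoding and sample rate you produce must match what you declare in `audioFormat`, whose actual fields are defined in the API documentation rather than here.

```python
import math
import struct


def floats_to_pcm16le(samples):
    """Convert floats in [-1.0, 1.0] to raw 16-bit little-endian PCM bytes."""
    out = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, s))  # clamp out-of-range values
        out += struct.pack("<h", int(round(clipped * 32767)))
    return bytes(out)


def chunk_pcm(pcm, sample_rate, chunk_ms, bytes_per_sample=2):
    """Split raw mono PCM into chunks of about chunk_ms milliseconds each."""
    chunk_bytes = sample_rate * chunk_ms // 1000 * bytes_per_sample
    return [pcm[i:i + chunk_bytes] for i in range(0, len(pcm), chunk_bytes)]


# One second of a 440 Hz tone at an assumed 16 kHz mono sample rate.
SAMPLE_RATE = 16000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
pcm = floats_to_pcm16le(tone)
chunks = chunk_pcm(pcm, sample_rate=SAMPLE_RATE, chunk_ms=100)
```

Each element of `chunks` is a plain `bytes` object with no container header, which is what distinguishes raw PCM from formats such as WAV.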
### Infrastructure update for `/transcripts` endpoint to improve speech-to-text performance for audio file processing

Infrastructure updates to improve performance of audio file processing:

* Updates to batch sync-to-async audio file processing to ensure consistent latency (RTFx 100)
* Model updates to reduce risk of hallucinations while improving medical term recall (English, German, French, Danish)
* Ensure efficient GPU utilization, refine load balancing and partition management (enhance autoscaling of ASR services)

No client action is necessary to select a specific model version or use the new architecture. See full details on creating transcripts from audio files [here](/api-reference/transcripts/create-transcript).

### New `/languages` endpoint now available

New REST endpoint available for clients to programmatically retrieve speech to text languages available per endpoint.

See [GET /languages](/api-reference/languages/list-languages).

### Updated German and French STT models released

Updated STT models for German (`de`, `de-CH`) and French (`fr`) have been released for improved medical term recall and WER performance. This release focused on general terminology recognition, medication names, and punctuation handling.

### Audio Health Events

A new feature for real-time audio streaming, **audio health events** are notifications from the speech to text system when audio quality may be compromised or problematic. The feature can be enabled with a boolean parameter in `transcribe` and `streams` configuration.

The following two scenarios are supported in this initial release:

* **Speech Quality Issue Detected** - triggered by background sounds, white noise, or compromised audio.
* **Long Silence Detected** - defined as 10s of continuous audio without any sound/noise.

Additional functionality is expected in subsequent releases, as well as ability to define threshold for alerting in the API configuration.

Learn more about `audioEvents` [here](/stt/audio-events).
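To make the "long silence" definition above concrete, here is a small client-side sketch that counts how long consecutive PCM chunks have stayed silent. It is not Corti's detection logic: the `SilenceTracker` class, the peak-amplitude threshold, and the assumption of 16-bit little-endian mono input are all hypothetical choices for illustration, and the server-side events remain the source of truth.

```python
import struct


class SilenceTracker:
    """Tracks how long a stream of 16-bit little-endian mono PCM has been silent."""

    def __init__(self, sample_rate, limit_seconds=10.0, threshold=0):
        self.limit_samples = int(sample_rate * limit_seconds)
        self.threshold = threshold  # assumed peak amplitude still counted as silence
        self.silent_samples = 0

    def feed(self, pcm_chunk):
        """Add one chunk; return True once continuous silence reaches the limit."""
        count = len(pcm_chunk) // 2
        samples = struct.unpack(f"<{count}h", pcm_chunk[: count * 2])
        peak = max((abs(s) for s in samples), default=0)
        if peak <= self.threshold:
            self.silent_samples += count
        else:
            self.silent_samples = 0  # any sound restarts the silence window
        return self.silent_samples >= self.limit_samples


tracker = SilenceTracker(sample_rate=16000)
one_second_of_silence = b"\x00\x00" * 16000
flags = [tracker.feed(one_second_of_silence) for _ in range(10)]
```

A local check like this could be used to warn a user before any audio is sent, but it does not replace enabling the feature in the API configuration.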
### Command and formatting updates

* Added timeout to return formatted text on silence
* Added non-breaking space before French punctuation characters
* Added support for square meter (m²) and square centimeter (cm²) units
* Fixed handling of decimal number format so that negative numbers are supported
* Security hardening: Improved validation of user-supplied command phrase inputs to prevent malformed values from affecting processing

### Updated French speech to text models now available

French speech to text model (language code `fr`) has been updated for improved medical term recall and conversational, far field audio support.

### Transcript output formatting now supported on Streams endpoint

[Formatting](/stt/formatting) is now supported on the `streams` endpoint, with server-defined default values applied automatically. At this time, API configuration of formatting options is not supported for `streams` as it is for `transcribe` endpoint.

Response messages for configuration errors have been clarified. See details [here](/api-reference/transcribe#5-error-handling) and [here](/api-reference/streams#5-error-handling).

### Improved diarization now available

Updates to Corti's Speech to Text models and diarizer, in which reliability and precision for speaker separation and identification have been optimized, have been released.

See details on this feature [here](/stt/diarization) and configuration recommendations for `streams` and `transcripts` requests [here](/stt/audio#channel-configuration).
### Formatting updates and improvements

* Add formatting for more units: "g/dL", "cL", "hL", "pL", "mg/L", "mg/kg", "U/mL", "mL/min"
* Add support for "times two" -> "x2" and "two plus" -> "2+" patterns (1 through 10)
* Fix handling of German spaced number variations

### Next-generation speech recognition models and infrastructure

New and improved model architecture and infrastructure bring the following actions and benefits:

| Actions                         | Benefits                                                                               |
| ------------------------------- | -------------------------------------------------------------------------------------- |
| **Optimize STT performance**    | Maximize accuracy of transcripts and diarization while minimizing hallucination risk   |
| **Refine system configuration** | Improve (reduce) latency, augment formatting, and re-introduce interim results feature |
| **Infrastructure resilience**   | Improve system scalability and GPU resource utilization                                |

STT models for the following languages have been updated on both `/transcribe` and `/streams` endpoints:

| Language     | Language Code |
| ------------ | ------------- |
| Danish       | `da`          |
| English      | `en`, `en-GB` |
| French       | `fr`          |
| German       | `de`          |
| Swiss German | `de-CH`       |

Beyond defining the desired language code in API requests, no client action is necessary to select a specific version of a model.

Please [contact us](mailto:help@corti.ai) for support or further information.

### Interim Results now available during real-time dictation for low latency transcript previews

Use the `interimResults` configuration parameter in `/transcribe` real-time dictation requests to have transcript previews returned from the server at a faster rate than final transcripts.

See full feature details [here](/stt/interim-results), parameter configuration details [here](/api-reference/transcribe#param-interim-results), and availability by language [here](/stt/transcribe).
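A common way for a client to consume previews alongside final transcripts is to keep final text as committed and let each new preview overwrite the previous one. The sketch below shows that pattern only. The `TranscriptView` class and the message shape (`text` and `isFinal` keys) are hypothetical and are not taken from the Corti API; consult the API reference for the actual message format.

```python
class TranscriptView:
    """Keeps committed final text plus the latest interim preview for display."""

    def __init__(self):
        self.final_segments = []
        self.interim = ""

    def handle(self, message):
        # Hypothetical message shape: {"text": str, "isFinal": bool}
        if message["isFinal"]:
            self.final_segments.append(message["text"])
            self.interim = ""  # the preview is superseded by the final text
        else:
            self.interim = message["text"]  # each preview replaces the previous one

    def render(self):
        """Return committed text followed by the current preview, if any."""
        parts = self.final_segments + ([self.interim] if self.interim else [])
        return " ".join(parts)


view = TranscriptView()
view.handle({"text": "patient has", "isFinal": False})
view.handle({"text": "patient has a fever", "isFinal": False})
view.handle({"text": "Patient has a fever.", "isFinal": True})
```

Treating previews as disposable keeps the displayed text consistent even when a final transcript differs from the preview that preceded it.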
### Locale-based formatting for dates, times, and numbers

Dictation formatting can now apply proper styling based on local standards for dates, times, and numbers. Locale is determined based on the `primaryLanguage` defined in the web socket configuration.

See full formatting details per language [here](/stt/formatting) and parameter configuration details [here](/api-reference/transcribe#param-formatting).

Note that legacy date parameter values are still supported by the API; no breaking changes were implemented.

### Dictation formatting improvements

* Expand units handling
* Expand acronym handling
* Improve vertebrae formatting: Support for both `letter number` and `letter number letter number` patterns (e.g., "L three" -> "L3" and "L three L four" -> "L3-L4")
* Improve TNM and cancer stage formatting: Support for `T number N number M number` and `stage number` patterns (e.g., "T two N one M one a" -> "T2 N1 M1a" and "stage two b" -> "Stage IIB")
* Improve percentage handling: Move handling from `spokenPunctuation` to `formatting` so that percent symbol is only returned when there is a number measurement (e.g., dictation, "What *percent* sure are you question mark *eighty percent*" -> "What *percent* sure are you? *80%*")
* Expand handling of German dates: Years without spaces (e.g., "zweitausendsechsundzwanzig" -> "2026") and a new spoken date pattern (e.g., "siebter fünfter neunzehn vierundachtzig" -> "07/05/1984")
* Expand handling of German units: Units with -n ending (e.g., "Millilitern")
* Add new default for ordinals, `numerals_above_nine`: Ordinals one through nine are written out (first, second, third) and ten and above are abbreviated (10th, 11th, 12th)

### Dictation formatting improvements

* Improved support for `en-GB` regional spelling variations
* Improved support for `de-CH`, `gsw-CH` regional spelling variations
* Remove extra whitespace observed with degree and percent symbols
* Updated handling of single digit numbers, now represented as numeral when followed by year/month/week/hour/day/minute/second(s)
* Updated handling of hyphenated numbers and age dictation
* Updated handling of "one twenty" (and similar number pattern) dictations
* Updated handling of Month-Year dictation
* Fix for an issue that occasionally caused both formatted and unformatted versions of transcript text to be emitted

Update to handling of number formatting in Danish.

Bug fixes:

* Update to French formatting for improved number handling
* Update to Danish formatting for liter abbreviation handling
* Bug fix for extra whitespace being returned with commands or punctuation
* Command service performance and reliability improvements

### Dictation Formatting now supported in more languages

Dictation formatting is now available in **English, German, and French**. There is also limited support in **Danish** with additional improvements in progress.

See detailed API specification [here](/api-reference/transcribe#param-formatting) and documentation of all available formatting options [here](/stt/formatting).

### Improved handling of initialisms in German dictation

An updated version of the German (`de`) language model was released with improved handling of dictated initialisms: spoken letters, numbers, and abbreviations.
Over 800 additional medical abbreviations were introduced to the system, with focused improvement on dental tooth exams, adding support for ICD-10 codes, and ability to have individual letters/numbers returned from the model.

### Audio input validation

Improved server-side validation of streamed audio - if audio does not meet requirements outlined [here](/stt/audio#supported-audio-formats), then a `400 Invalid Audio` error will be returned.

### Addition of new web socket event: `flush`

Provide clients the ability to force clear the audio buffer without closing the connection. In response to a `"type": "flush"` message from the client, the server will return recognized text and/or commands and respond with `"type": "flushed"`, and keep the web socket connection open.

See more details on `/transcribe` [here](/api-reference/transcribe#flush-the-audio-buffer) and `/streams` [here](/api-reference/streams#flush-the-audio-buffer).

### Update to Swiss German language codes

There are now two different language codes that may be used for Swiss German use cases:

* **Swiss German** (language code `gsw-CH`), where dialectical Swiss German is spoken (recommended for AI scribe use cases)
* **Swiss High German** (language code `de-CH`), where Swiss High German is spoken (recommended for dictation use cases)

See more details [here](/stt/languages), or please [contact us](mailto:help@corti.ai) if you need further assistance in selecting the best language code.

### Updated Danish language model

Updated Danish (`da`) language model is now available for dictation (`/transcribe`) and ambient (`/streams`) use cases. The language is rated as `premier` tier as over 158,000 medical terms were included in the training and validation data sets.

Bug fixes:

* Fixed an issue that prevented `automaticPunctuation` parameter from working as expected.
Furthermore, `spokenPunctuation` and `automaticPunctuation` are mutually exclusive: Only one of these parameters should be set to `true` in a given `/transcribe` configuration, and if both settings are present and set to true, then `spokenPunctuation` will take precedence. See more detail [here](/stt/punctuation).

### Updated French language model

Updated French (`fr`) language model is now available for dictation (`/transcribe`) and ambient (`/streams`) use cases. The language is rated as `premier` since over 170,000 medical terms were included in the training and validation data sets.

### Dictation now supported in Dutch and Norwegian

Norwegian (`no`) and Dutch (`nl`) are now available for use on the `/transcribe` endpoint for dictation use cases. As a result they have been moved to the `enhanced` language tier.

Additionally, Hungarian (`hu`) medical terminology has been expanded and improved.

See more information about supported functionality for dictation [here](/stt/transcribe) and language tiers [here](/stt/languages).

Feature limitation:

* Some of the recent `/transcripts` endpoint language model updates (`de`, `en`, `en-GB`, `fr`) have been rolled back so that performance of asynchronous audio file processing can be improved. Infrastructure updates are underway to improve model performance.
* Speech-to-text accuracy from `/transcripts` asynchronous audio file processing may be degraded as compared to real-time audio processing via the `/streams` and `/transcribe` APIs, which are not impacted by this issue.

### Updated English language model

Updated English (`en`) language model is now available for dictation (`/transcribe`), ambient (`/streams`), and transcription (`/transcripts`) use cases. This version is rated as `premier` tier as over 170,000 medical terms were included in the training and validation data sets.
### Improved transcoding support in speech-to-text APIs

Update to both streaming and asynchronous audio file processing so that **transcoding is supported**. Previously audio files were required to conform to 16-bit, 16 kHz formatting. Now, assuming proper file types are used, any precision and sample rate are accepted. See more details [here](/stt/audio).

Parameter deprecation:

* Definition of the parameter `modelName` is no longer required in `/transcripts` API requests. The latest model available per language will be applied automatically.
* If the argument is included in the request, then it will be ignored. If the configuration is otherwise valid, then the request will process as expected. See full specification [here](/api-reference/transcripts/create-transcript).

Feature limitation:

* The interim (preview) results feature of the `/transcribe` API is not performing as expected. As a result, it is being disabled for most languages to prevent issues with speech-to-text latency and accuracy.
* If the `interimResults` parameter is included in the request for a language that does not support this functionality, it will be ignored and, so long as the request is otherwise valid, the configuration will be accepted.

### Conversational transcripts now supported in Arabic

Arabic (language code `ar`) is now available for ambient documentation workflows: capture conversations spoken in Arabic via the `/streams` API.

### New dictation functionality available: `Formatting`

New dictation functionality available: **Formatting**. Take control over how dates, time, units, and numbers should be transcribed in the streaming speech to text output.

See detailed API specification [here](/api-reference/transcribe#param-formatting) and documentation of all available formatting options [here](/stt/formatting).
### Updated German language model

Updated German (de) language model is now available for dictation (`/transcribe`), ambient (`/streams`), and transcription (`/transcripts`) use cases. This version is rated `premier` tier as over 150,000 medical terms were included in the training and validation data sets.

### Dictation now supported in Hungarian

Hungarian (hu) Enhanced language model is now available for dictation (`/transcribe`).

Swedish (sv) Enhanced language model is now available for dictation (`/transcribe`).

Updated Norwegian (no) and Swedish (sv) Base language models are now available for ambient documentation (`/streams`) and transcription (`/transcripts`) workflows.

Danish (da), German (de), and French (fr) Enhanced language models now available for dictation (`/transcribe`) workflows.

New API endpoint, `/transcribe` now available! Use this endpoint for stateless, real-time streaming dictation workflows. See more details [here](/api-reference/transcribe/) in the API reference and [here](/stt/dictation-web/) for access to the Corti Dictation Web Component.
Swiss German (de-CH) Enhanced language model now available for ambient documentation (`/streams`) workflows.

French (fr) Enhanced language model now available for ambient documentation (`/streams`) workflows.

Introducing a new tier system for defining functionality and performance of speech to text language models. Read more about it [here](/stt/languages/).

Updated German (de) and Swiss German (de-CH) language models available.

Updated Danish (da) and Swedish (sv) language models available.

Announcing the launch of Corti AI Platform. Read more about it [here](https://www.corti.ai/).

The following languages are supported by Corti speech to text: English (en for US English, and en-GB for UK English), Danish (da), German (de), Swiss German (de-CH), French (fr), Swedish (sv), Spanish (es), Norwegian (no), Dutch (nl), Italian (it), and Portuguese (pt).