This guide is designed to help both clinical users and IT administrators understand what to expect when migrating from legacy desktop dictation tools to Corti Symphony speech-to-text. It covers which features carry over directly, which require adaptation, and which are not currently supported.
Corti Symphony is an API-first platform. Unlike legacy dictation tools, which are end-user desktop applications, Symphony is embedded by developers into clinical applications and workflows. Some feature differences reflect this architectural distinction rather than capability gaps.

Feature Comparison

| Feature | Legacy desktop dictation | Corti Symphony |
| --- | --- | --- |
| Real-time speech-to-text | ✅ | ✅ |
| Medical vocabulary accuracy | ✅ | ✅ |
| No voice training required | ✅ | ✅ |
| Cloud-based, cross-device availability | ✅ | ✅ |
| Microphone device flexibility | ✅ | ✅ |
| Automated punctuation | ✅ | ✅ |
| Spoken punctuation | ✅ | ✅ |
| Text formatting (dates, numbers, measurements) | ✅ | ✅ |
| Multiple languages | ✅ | ✅ |
| Hold-to-talk / toggle-to-talk recording | ✅ | ✅ |
| Low latency text preview | ✅ | ✅ |
| Custom commands - macros, auto-texts, templates, selection, navigation, and more | ✅ | ☑️ |
| Custom dictionary / user word lists | ✅ | 🔜 |
| Bulleted and numbered lists | ✅ | ☑️ |
| Replacement rules | ✅ | ☑️ |
| Roman numerals | ✅ | ☑️ |
| Windows OS / Mac OS / application voice control | ✅ | ☑️ |
| Offline / local processing mode | ✅ | ❌ |
Legend:
  ✅ Supported natively
  ☑️ Enabled with developer implementation
  🔜 Coming soon
  ❌ Not supported

For Clinicians and End Users

This section explains what your day-to-day dictation experience will look like after migration, and what to expect from each feature you may rely on today.

What works the same

Real-time dictation accuracy
Corti Symphony delivers real-time speech-to-text with industry-leading medical term accuracy. Like legacy dictation apps, it requires no voice profile training, and you can start dictating immediately.
No voice training required
Symphony adapts to your accent and speaking style automatically. There is no setup session required before your first use.
Spoken and automatic punctuation
Both punctuation modes are available. You can speak punctuation marks by name (e.g., “period,” “comma,” “new paragraph”) or enable automatic punctuation and let the model handle it based on your speech cadence.
Note that the two modes are mutually exclusive: if both are enabled, spoken punctuation takes priority, as it is the recommended mode for desktop dictation workflows.
Text formatting
Symphony automatically formats dates, times, numbers, measurements, and ordinals as you dictate, for example converting “February third” to “3 February 2025” or “one hundred and fifty milligrams” to “150 mg.” This is equivalent to the auto-formatting behavior in legacy dictation apps and is enabled by default on the Transcribe endpoint, with options to configure.
Hold-to-talk and toggle-to-talk
Symphony supports both recording modes. Your application vendor configures which mode is active and how the microphone setup can be adjusted.
Low latency text preview
As you dictate, Symphony streams interim results to the client, giving you live feedback that your audio is being captured correctly. These previews may occasionally differ from the final transcript, which is normal. Only the finalized text should be applied for your documentation.
Microphone flexibility
Corti Symphony works with standard USB microphones, dedicated dictation microphones (such as Philips SpeechMike devices), and the built-in microphone on supported workstations. Your application may also support using a smartphone as a wireless microphone. Consult your application vendor for supported hardware.
Commands: Auto-texts and template insertion
Auto-texts let you insert pre-written templates by speaking a short phrase. In Symphony, this same outcome is achieved through the Commands feature of the transcribe API: you speak a phrase, Symphony recognizes it as a command and returns the command ID to your application, and the application then inserts the appropriate template.

The practical result for clinicians is the same: speak a phrase, see your template appear. However, the template library and commands need to be migrated and adapted for your application rather than stored inside a user profile.
Commands: Editing and navigation
The built-in command libraries of legacy dictation apps cover hundreds of actions (select last sentence, scratch that, go to end, etc.), all of which can be replicated using the Commands feature of the Symphony transcribe API.

The exact command phrases you use may differ in your new application. Ask your application vendor for a command reference sheet specific to your deployment.

What is not supported

Custom personal vocabulary (user word lists)
In legacy dictation apps, clinicians can add custom words or preferred spellings to their personal vocabulary. Symphony vocabulary management is currently in development and will be available soon. In the meantime, medical terminology accuracy is managed at the model level by Corti, and feedback on missed terms can be submitted via your application’s session feedback mechanism.
Windows/Mac OS and application voice control
Desktop dictation apps include the ability to control the desktop itself by voice: opening applications, clicking buttons in the operating system, and navigating outside your EHR. Symphony is a speech-to-text and command API; it does not control the operating system or applications outside its integration point.

If you rely on OS-level voice control for accessibility reasons, speak to your IT team about how Symphony is integrated to ensure compatible functionality is enabled.
Offline / disconnected use
Symphony is a cloud-based API and requires an internet connection. Some desktop dictation apps support a local offline mode for environments with intermittent connectivity. If your clinical environment has unreliable network access, raise this with your IT team during migration planning and consider the Symphony Transcripts API for recorded audio file processing.
If you have a large personal library of auto-texts, commands, and dictionary words, work with your IT team or application vendor to migrate these into the Symphony-supported configuration before go-live.

For IT administrators and developers

This section covers the technical and architectural differences relevant to deploying and configuring Corti Symphony as a replacement for legacy dictation apps.

Architecture overview

Legacy desktop dictation applications are installed on Windows or Mac workstations and either process speech recognition locally on the device or connect to the vendor’s cloud services. Configuration is managed through an administration portal and pushed to user profiles.

Corti Symphony is an API platform. All integration is done at the application level: embedding the Dictation Web Component for rapid deployment, integrating the Symphony SDK, or building a custom integration against the Transcribe WebSocket endpoint. Corti also offers two additional speech-to-text endpoints, Streams and Transcripts, for conversational transcript/facts extraction and recorded audio file processing, respectively.
There is no Symphony equivalent of a desktop dictation application. Speech input and command handling are managed entirely within your clinical application, but the Symphony SDK and Dictation Web Component simplify the integration effort; see details here.
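For custom integrations, the session is typically configured when the WebSocket opens. The sketch below assembles an opening configuration message; the message shape (`type`, `token`) and the overall envelope are illustrative assumptions, so consult the Transcribe WebSocket reference for the actual contract. Only the punctuation flags and command fields mirror options described in this guide.

```typescript
// Sketch: assembling the opening configuration message for a /transcribe
// session. Field names beyond those documented in this guide are
// illustrative assumptions, not the real wire format.

type CommandDef = {
  id: string; // unique identifier returned to your application
  phrases: string[]; // spoken phrases that activate the command
  variables?: Record<string, string[]>; // optional enum parameters
};

type TranscribeConfig = {
  language: string;
  spokenPunctuation?: boolean;
  automatedPunctuation?: boolean;
  commands?: CommandDef[];
};

function buildConfigMessage(token: string, config: TranscribeConfig): string {
  // Sent once, immediately after the WebSocket opens; commands and
  // punctuation settings apply for the lifetime of the session.
  return JSON.stringify({ type: "config", token, ...config });
}
```

In a real integration this string would be sent with `ws.send(...)` as soon as the connection opens, before any audio is streamed.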

Choosing the right endpoint

| Use case | Recommended endpoint |
| --- | --- |
| Clinical dictation into desktop applications or EHR fields | /transcribe |
| Ambient documentation / conversational capture | /streams |
| Batch processing of recorded audio | /transcripts |
The /transcribe endpoint is the closest functional equivalent to desktop dictation mode. It supports real-time transcription, commands, spoken and automatic punctuation, text formatting, and interim results.
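The interim-results behavior can be sketched as a small client-side reducer. The message field names (`text`, `isFinal`) are hypothetical placeholders for whatever the transcribe payload actually contains; the point is the pattern of committing only finalized text.

```typescript
// Sketch: applying interim vs. final transcript messages on the client.
// Field names are assumptions for illustration only.

type TranscriptMsg = { text: string; isFinal: boolean };
type EditorState = { committed: string; preview: string };

function applyTranscript(state: EditorState, msg: TranscriptMsg): EditorState {
  if (msg.isFinal) {
    // Only finalized text is committed to the document.
    return { committed: state.committed + msg.text, preview: "" };
  }
  // Interim results only update the live preview; they may differ
  // from the final transcript and must never be persisted.
  return { ...state, preview: msg.text };
}
```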

Language support

Symphony supports 14+ languages across its endpoints. Feature availability varies by language; consult the languages documentation and the transcribe endpoint overview for full feature details.

Transcripts: inserting text, handling casing and spacing

Desktop dictation clients handle text insertion at the cursor location. Your client application must replicate this logic when integrating Symphony speech-to-text. Best-practice details are outlined here.
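A minimal sketch of that insertion logic is below, assuming a plain-text field represented as a string plus cursor index. Real integrations must also handle rich-text fields and selection replacement; the spacing and capitalization rules here are simplified examples, not Symphony behavior.

```typescript
// Sketch: inserting finalized transcript text at the cursor position,
// with the basic spacing/casing handling a desktop client used to manage.

function insertAtCursor(
  field: string,
  cursor: number,
  text: string
): { value: string; cursor: number } {
  const before = field.slice(0, cursor);
  const after = field.slice(cursor);
  let insert = text;
  // Add a separating space when appending directly after a word.
  if (before.length > 0 && !/\s$/.test(before) && !/^[\s.,;:]/.test(insert)) {
    insert = " " + insert;
  }
  // Capitalize at field start or after sentence-ending punctuation.
  if (/(^|[.!?]\s*)$/.test(before)) {
    insert = insert.replace(/[a-z]/, (c) => c.toUpperCase());
  }
  return { value: before + insert + after, cursor: (before + insert).length };
}
```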

Commands: replacing auto-texts and step-by-step commands

Auto-texts and step-by-step commands are replaced in Symphony by the Commands API, available on the /transcribe endpoint. Each command is defined with:
  • ID: a unique identifier your application uses to trigger the right action
  • Phrases: the spoken words that activate the command
  • Variables: optional enum lists for commands that accept spoken parameters (e.g., a command that navigates to a named section)
When Symphony detects a command phrase, it returns a command object instead of, or in addition to, a transcript object over the WebSocket. Your application is responsible for executing the corresponding action: inserting a template, moving focus, deleting content, etc.
Before go-live, audit your organization’s existing auto-text and step-by-step command libraries. Map each command to its Symphony equivalent configuration and ensure your application’s command action handlers are tested end-to-end.
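The registration-plus-dispatch pattern above can be sketched as follows. The command IDs, phrases, and handler return values are invented examples; the action handlers are entirely application-defined, which is exactly the part your integration owns.

```typescript
// Sketch: example command definitions (registered at session open) and an
// application-side dispatcher for the command objects Symphony returns.
// All IDs, phrases, and actions here are hypothetical examples.

type CommandDef = {
  id: string;
  phrases: string[];
  variables?: Record<string, string[]>;
};

const commands: CommandDef[] = [
  { id: "insert-normal-exam", phrases: ["insert normal exam", "normal exam template"] },
  { id: "go-to-section", phrases: ["go to section"], variables: { section: ["history", "exam", "plan"] } },
];

// Application-side handlers, keyed by command ID. Each returns a string
// describing the action taken, standing in for real editor operations.
const handlers: Record<string, (vars?: Record<string, string>) => string> = {
  "insert-normal-exam": () => "template:normal-exam",
  "go-to-section": (vars) => `focus:${vars?.section ?? "unknown"}`,
};

function dispatchCommand(id: string, vars?: Record<string, string>): string {
  const handler = handlers[id];
  if (!handler) return "noop"; // Unknown command: ignore rather than fail.
  return handler(vars);
}
```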
Key differences from legacy dictation apps:
  • Commands do not simulate keystrokes at the OS level. All actions must be handled within your application’s own logic.
  • There is no shared cross-application command registry. Commands are registered as part of the configuration when opening the WebSocket.

Punctuation configuration

Symphony offers two mutually exclusive punctuation modes, configured per session:
  • spokenPunctuation: true: users speak punctuation marks by name (recommended)
  • automatedPunctuation: true: the model inserts punctuation automatically
When both are set to true, spokenPunctuation takes priority. Do not enable both unless you intend for spoken punctuation to be the active mode.
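A small guard in the integrating application can make the priority rule explicit rather than relying on the server default. The config shape below uses the two documented flags; wrapping them in a named type is an implementation choice, not part of the API.

```typescript
// Sketch: resolving the effective punctuation mode from session config,
// mirroring the documented rule that spokenPunctuation wins when both
// flags are set.

type PunctuationConfig = {
  spokenPunctuation: boolean;
  automatedPunctuation: boolean;
};

function resolvePunctuationMode(
  cfg: PunctuationConfig
): "spoken" | "automated" | "none" {
  if (cfg.spokenPunctuation) return "spoken"; // takes priority
  if (cfg.automatedPunctuation) return "automated";
  return "none";
}
```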
Some EHR text fields apply “smart punctuation” transformations (e.g., converting straight quotes to curly quotes). This can interfere with Symphony’s output. Review the Smart Punctuation Integration Guide before deploying into rich-text environments.

Text formatting

Formatting of dates, times, numbers, measurements, and ordinals is configurable per session on the /transcribe endpoint. When no formatting parameters are supplied, sensible locale-aware defaults are applied automatically by the server (e.g., en defaults to 12-hour time and “February 3, 2025” date format; en-GB defaults to 24-hour time and “3 February 2025”). See the detailed available formatting options here.
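The locale difference in those defaults can be demonstrated with the standard Intl API. Symphony applies its own server-side formatting; this snippet only illustrates how the same date renders per locale, which is useful when testing output against locale requirements.

```typescript
// Illustration only: the en vs. en-GB date rendering difference, shown
// with the standard Intl API (not Symphony's formatter).

const date = new Date(2025, 1, 3); // 3 February 2025

const opts: Intl.DateTimeFormatOptions = { day: "numeric", month: "long", year: "numeric" };
const enUS = new Intl.DateTimeFormat("en-US", opts).format(date);
const enGB = new Intl.DateTimeFormat("en-GB", opts).format(date);
// enUS -> "February 3, 2025"; enGB -> "3 February 2025"
```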

Custom vocabulary

Legacy dictation apps allow administrators to deploy custom word lists and preferred spellings via the admin portal. Symphony’s vocabulary management feature is currently in development. In the interim:
  • Medical terminology is handled at the model level and covers a broad clinical vocabulary.
  • For terms that are consistently mis-recognized, use the Submit Feedback form to flag errors and make requests.
  • For terms that require a specific spelling or formatting, consider whether a Command with a fixed text output is an appropriate workaround.

Diarization

Diarization is not available on /transcribe. If your use case involves multi-speaker audio capture (e.g., telehealth consultations, ward rounds), Symphony’s diarization and multichannel support are available on the /streams and /transcripts endpoints.

User profiles and cloud access

Legacy dictation apps typically store user profiles (voice model, auto-texts, settings, etc.) in a local cache or the host vendor cloud environment, accessible from any workstation with the desktop dictation client installed. Symphony does not have a user profile concept in the traditional sense. Configuration (commands, formatting, punctuation) is applied at the session level by the integrating application. This means:
  • There are no per-user settings to migrate to Symphony.
  • Personalization is handled by the application layer, not the Symphony API.
  • Cross-device consistency is ensured by your application’s own session management.

Connectivity requirements

The Symphony transcribe API requires a stable internet connection. There is no offline or local processing mode. Assess network reliability at all clinical sites before migration, particularly in environments that previously relied on the local fallback capabilities of legacy dictation apps.

Migration checklist

Use this checklist to structure your migration project.

Before go-live
  • Confirm the target EHR/application has Symphony integration in place and tested, especially with regard to transcript text insertion
  • Configure punctuation mode (spoken vs. automated) appropriate to clinical workflows
  • Test text formatting output against locale requirements (date, time, number formats)
  • Audit existing command and auto-text library and map to Symphony commands
  • Test that commands requiring application-layer implementation operate successfully
  • Prepare a command reference sheet for end users and train clinical users on any changes to command phrases
  • Verify network connectivity meets Symphony’s requirements at all sites
Post go-live
  • Monitor session feedback submissions and escalate recurring mis-recognitions to Corti using the Feedback form
  • Confirm custom vocabulary requirements and track vocabulary management feature availability
  • Review interim results behaviour with end users and confirm display settings are appropriate
  • Collect clinician feedback on command coverage and add missing commands as needed

Getting help