This guide is designed to help both clinical users and IT administrators understand what to expect when migrating from legacy desktop dictation tools to Corti Symphony speech-to-text. It covers which features carry over directly, which require adaptation, and which are not currently supported.
Corti Symphony is an API-first platform. Unlike legacy dictation tools, which are end-user desktop applications, Symphony is embedded by developers into clinical applications and workflows. Some feature differences reflect this architectural distinction rather than capability gaps.

Feature Comparison

| Feature | Legacy desktop dictation | Corti Symphony |
| --- | --- | --- |
| Real-time speech-to-text | ✅ | ✅ |
| Medical vocabulary accuracy | ✅ | ✅ |
| No voice training required | ✅ | ✅ |
| Cloud-based, cross-device availability | ✅ | ✅ |
| Microphone device flexibility | ✅ | ✅ |
| Automated punctuation | ✅ | ✅ |
| Spoken punctuation | ✅ | ✅ |
| Text formatting (dates, numbers, measurements) | ✅ | ✅ |
| Multiple languages | ✅ | ✅ |
| Hold-to-talk / toggle-to-talk recording | ✅ | ✅ |
| Low latency text preview | ✅ | ✅ |
| Custom commands - macros, auto-texts, templates, selection, navigation, and more | ✅ | ☑️ |
| Custom dictionary / user word lists | ✅ | 🔜 |
| Bulleted and numbered lists | ✅ | ☑️ |
| Replacement rules | ✅ | ☑️ |
| Roman numerals | ✅ | ☑️ |
| Windows OS / Mac OS / application voice control | ✅ | ☑️ |
| Offline / local processing mode | ✅ | ❌ |
Legend:
  ✅ Supported natively
  ☑️ Enabled with developer implementation
  🔜 Coming soon
  ❌ Not supported

For Clinicians and End Users

This section explains what your day-to-day dictation experience will look like after migration, and what to expect from each feature you may rely on today.

What works the same

Real-time dictation accuracy
Corti Symphony delivers real-time speech-to-text with industry-leading medical term accuracy. Like legacy dictation apps, it requires no voice profile training, and you can start dictating immediately.
No voice training required
Symphony adapts to your accent and speaking style automatically. There is no setup session required before your first use.
Spoken and automatic punctuation
Both punctuation modes are available. You can speak punctuation marks by name (e.g., “period,” “comma,” “new paragraph”) or enable automatic punctuation and let the model handle it based on your speech cadence.
Note that the two modes are mutually exclusive: if both are enabled, spoken punctuation takes priority, as it is the recommended mode for desktop dictation workflows.
Text formatting
Symphony automatically formats dates, times, numbers, measurements, and ordinals as you dictate, for example converting “February third” to “3 February 2025” or “one hundred and fifty milligrams” to “150 mg.” This is equivalent to the auto-formatting behavior in legacy dictation apps and is enabled by default on the Transcribe endpoint, with options to configure.
Hold-to-talk and toggle-to-talk
Symphony supports both recording modes. Your application vendor configures which mode is active and how the microphone setup can be adjusted.
Low latency text preview
As you dictate, Symphony streams interim results to the client, giving you live feedback that your audio is being captured correctly. These previews may occasionally differ from the final transcript, which is normal. Only the finalized text should be applied for your documentation.
Microphone flexibility
Corti Symphony works with standard USB microphones, dedicated dictation microphones (such as Philips SpeechMike devices), and the built-in microphone on supported workstations. Your application may also support using a smartphone as a wireless microphone. Consult your application vendor for supported hardware.
Commands: Auto-texts and template insertion
Auto-texts let you insert pre-written templates by speaking a short phrase. In Symphony, this same outcome is achieved through the Commands feature of the transcribe API: you speak a phrase, Symphony recognizes it as a command and returns the command ID to your application, and the application then inserts the appropriate template.

The practical result for clinicians is the same: speak a phrase, see your template appear. However, the template library and commands need to be migrated and adapted for your application rather than stored inside a user profile.
Commands: Editing and navigation
The built-in command libraries of legacy dictation apps cover hundreds of actions (select last sentence, scratch that, go to end, etc.), all of which can be replicated using the Commands feature of the Symphony transcribe API.

The exact command phrases you use may differ in your new application. Ask your application vendor for a command reference sheet specific to your deployment.

What is not supported

Custom personal vocabulary (user word lists)
In legacy dictation apps, clinicians can add custom words or preferred spellings to their personal vocabulary. Symphony vocabulary management is currently in development and will be available soon. In the meantime, medical terminology accuracy is managed at the model level by Corti, and feedback on missed terms can be submitted via your application’s session feedback mechanism.
Windows/Mac OS and application voice control
Desktop dictation apps include the ability to control the desktop itself by voice: opening applications, clicking buttons in the operating system, and navigating outside your EHR. Symphony is a speech-to-text and command API; it does not control the operating system or applications outside its integration point.

If you rely on OS-level voice control for accessibility reasons, speak to your IT team about how Symphony is integrated to ensure compatible functionality is enabled.
Offline / disconnected use
Symphony is a cloud-based API and requires an internet connection. Some desktop dictation apps support a local offline mode for environments with intermittent connectivity. If your clinical environment has unreliable network access, raise this with your IT team during migration planning and consider the Symphony Transcripts API for recorded audio file processing.
If you have a large personal library of auto-texts, commands, and dictionary words, work with your IT team or application vendor to migrate these into the Symphony-supported configuration before go-live.

For IT administrators and developers

This section covers the technical and architectural differences relevant to deploying and configuring Corti Symphony as a replacement for legacy dictation apps.

Architecture overview

Legacy desktop dictation applications are installed on Windows or Mac workstations and either process speech recognition locally on the device or connect to the vendor’s cloud services. Configuration is managed through an administration portal and pushed to user profiles.

Corti Symphony is an API platform. All integration is done at the application level: embedding the Dictation Web Component for rapid deployment, integrating the Symphony SDK, or building a custom integration against the Transcribe WebSocket endpoint. Corti also offers two additional speech-to-text endpoints, Streams and Transcripts, for conversational transcript/facts extraction and recorded audio file processing, respectively.
There is no Symphony equivalent of a desktop dictation application. Speech input and command handling are managed entirely within your clinical application, but the Symphony SDK and Dictation Web Component simplify the integration effort; see details here.
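For custom integrations, the session is typically configured when the WebSocket opens. The sketch below assembles an opening configuration message; the message shape (`type`, `token`) and the overall envelope are illustrative assumptions, so consult the Transcribe WebSocket reference for the actual contract. Only the punctuation flags and command fields mirror options described in this guide.

```typescript
// Sketch: assembling the opening configuration message for a /transcribe
// session. Field names beyond those documented in this guide are
// illustrative assumptions, not the real wire format.

type CommandDef = {
  id: string; // unique identifier returned to your application
  phrases: string[]; // spoken phrases that activate the command
  variables?: Record<string, string[]>; // optional enum parameters
};

type TranscribeConfig = {
  language: string;
  spokenPunctuation?: boolean;
  automatedPunctuation?: boolean;
  commands?: CommandDef[];
};

function buildConfigMessage(token: string, config: TranscribeConfig): string {
  // Sent once, immediately after the WebSocket opens; commands and
  // punctuation settings apply for the lifetime of the session.
  return JSON.stringify({ type: "config", token, ...config });
}
```

In a real integration this string would be sent with `ws.send(...)` as soon as the connection opens, before any audio is streamed.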

Choosing the right endpoint

| Use case | Recommended endpoint |
| --- | --- |
| Clinical dictation into desktop applications or EHR fields | /transcribe |
| Ambient documentation / conversational capture | /streams |
| Batch processing of recorded audio | /transcripts |
The /transcribe endpoint is the closest functional equivalent to desktop dictation mode. It supports real-time transcription, commands, spoken and automatic punctuation, text formatting, and interim results.
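The interim-results behavior can be sketched as a small client-side reducer. The message field names (`text`, `isFinal`) are hypothetical placeholders for whatever the transcribe payload actually contains; the point is the pattern of committing only finalized text.

```typescript
// Sketch: applying interim vs. final transcript messages on the client.
// Field names are assumptions for illustration only.

type TranscriptMsg = { text: string; isFinal: boolean };
type EditorState = { committed: string; preview: string };

function applyTranscript(state: EditorState, msg: TranscriptMsg): EditorState {
  if (msg.isFinal) {
    // Only finalized text is committed to the document.
    return { committed: state.committed + msg.text, preview: "" };
  }
  // Interim results only update the live preview; they may differ
  // from the final transcript and must never be persisted.
  return { ...state, preview: msg.text };
}
```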

Language support

Symphony supports 14+ languages across its endpoints. Feature availability varies by language; consult the languages documentation and the transcribe endpoint overview for full feature details.

Transcripts: inserting text, handling casing and spacing

Desktop dictation clients handle text insertion at the cursor location. Your client application must replicate this logic when integrating Symphony speech-to-text. Best-practice details are outlined here.
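A minimal sketch of that insertion logic is below, assuming a plain-text field represented as a string plus cursor index. Real integrations must also handle rich-text fields and selection replacement; the spacing and capitalization rules here are simplified examples, not Symphony behavior.

```typescript
// Sketch: inserting finalized transcript text at the cursor position,
// with the basic spacing/casing handling a desktop client used to manage.

function insertAtCursor(
  field: string,
  cursor: number,
  text: string
): { value: string; cursor: number } {
  const before = field.slice(0, cursor);
  const after = field.slice(cursor);
  let insert = text;
  // Add a separating space when appending directly after a word.
  if (before.length > 0 && !/\s$/.test(before) && !/^[\s.,;:]/.test(insert)) {
    insert = " " + insert;
  }
  // Capitalize at field start or after sentence-ending punctuation.
  if (/(^|[.!?]\s*)$/.test(before)) {
    insert = insert.replace(/[a-z]/, (c) => c.toUpperCase());
  }
  return { value: before + insert + after, cursor: (before + insert).length };
}
```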

Commands: replacing auto-texts and step-by-step commands

Auto-texts and step-by-step commands are replaced in Symphony by the Commands API, available on the /transcribe endpoint. Each command is defined with:
  • ID: a unique identifier your application uses to trigger the right action
  • Phrases: the spoken words that activate the command
  • Variables: optional enum lists for commands that accept spoken parameters (e.g., a command that navigates to a named section)
When Symphony detects a command phrase, it returns a command object instead of, or in addition to, a transcript object over the WebSocket. Your application is responsible for executing the corresponding action: inserting a template, moving focus, deleting content, etc.
Before go-live, audit your organization’s existing auto-text and step-by-step command libraries. Map each command to its Symphony equivalent configuration and ensure your application’s command action handlers are tested end-to-end.
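The registration-plus-dispatch pattern above can be sketched as follows. The command IDs, phrases, and handler return values are invented examples; the action handlers are entirely application-defined, which is exactly the part your integration owns.

```typescript
// Sketch: example command definitions (registered at session open) and an
// application-side dispatcher for the command objects Symphony returns.
// All IDs, phrases, and actions here are hypothetical examples.

type CommandDef = {
  id: string;
  phrases: string[];
  variables?: Record<string, string[]>;
};

const commands: CommandDef[] = [
  { id: "insert-normal-exam", phrases: ["insert normal exam", "normal exam template"] },
  { id: "go-to-section", phrases: ["go to section"], variables: { section: ["history", "exam", "plan"] } },
];

// Application-side handlers, keyed by command ID. Each returns a string
// describing the action taken, standing in for real editor operations.
const handlers: Record<string, (vars?: Record<string, string>) => string> = {
  "insert-normal-exam": () => "template:normal-exam",
  "go-to-section": (vars) => `focus:${vars?.section ?? "unknown"}`,
};

function dispatchCommand(id: string, vars?: Record<string, string>): string {
  const handler = handlers[id];
  if (!handler) return "noop"; // Unknown command: ignore rather than fail.
  return handler(vars);
}
```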
Key differences from legacy dictation apps:
  • Commands do not simulate keystrokes at the OS level. All actions must be handled within your application’s own logic.
  • There is no shared cross-application command registry. Commands are registered as part of the configuration when opening the WebSocket.

Punctuation configuration

Symphony offers two mutually exclusive punctuation modes, configured per session:
  • spokenPunctuation: true: users speak punctuation marks by name (recommended)
  • automatedPunctuation: true: the model inserts punctuation automatically
When both are set to true, spokenPunctuation takes priority. Do not enable both unless you intend for spoken punctuation to be the active mode.
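A small guard in the integrating application can make the priority rule explicit rather than relying on the server default. The config shape below uses the two documented flags; wrapping them in a named type is an implementation choice, not part of the API.

```typescript
// Sketch: resolving the effective punctuation mode from session config,
// mirroring the documented rule that spokenPunctuation wins when both
// flags are set.

type PunctuationConfig = {
  spokenPunctuation: boolean;
  automatedPunctuation: boolean;
};

function resolvePunctuationMode(
  cfg: PunctuationConfig
): "spoken" | "automated" | "none" {
  if (cfg.spokenPunctuation) return "spoken"; // takes priority
  if (cfg.automatedPunctuation) return "automated";
  return "none";
}
```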
Some EHR text fields apply “smart punctuation” transformations (e.g., converting straight quotes to curly quotes). This can interfere with Symphony’s output. Review the Smart Punctuation Integration Guide before deploying into rich-text environments.

Text formatting

Formatting of dates, times, numbers, measurements, and ordinals is configurable per session on the /transcribe endpoint. When no formatting parameters are supplied, sensible locale-aware defaults are applied automatically by the server (e.g., en defaults to 12-hour time and “February 3, 2025” date format; en-GB defaults to 24-hour time and “3 February 2025”). See the detailed available formatting options here.
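The locale difference in those defaults can be demonstrated with the standard Intl API. Symphony applies its own server-side formatting; this snippet only illustrates how the same date renders per locale, which is useful when testing output against locale requirements.

```typescript
// Illustration only: the en vs. en-GB date rendering difference, shown
// with the standard Intl API (not Symphony's formatter).

const date = new Date(2025, 1, 3); // 3 February 2025

const opts: Intl.DateTimeFormatOptions = { day: "numeric", month: "long", year: "numeric" };
const enUS = new Intl.DateTimeFormat("en-US", opts).format(date);
const enGB = new Intl.DateTimeFormat("en-GB", opts).format(date);
// enUS -> "February 3, 2025"; enGB -> "3 February 2025"
```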

Custom vocabulary

Legacy dictation apps allow administrators to deploy custom word lists and preferred spellings via the admin portal. Symphony’s vocabulary management feature is currently in development. In the interim:
  • Medical terminology is handled at the model level and covers a broad clinical vocabulary.
  • For terms that are consistently mis-recognized, use the Submit Feedback form to flag errors and make requests.
  • For terms that require a specific spelling or formatting, consider whether a Command with a fixed text output is an appropriate workaround.

Diarization

Diarization is not available on /transcribe. If your use case involves multi-speaker audio capture (e.g., telehealth consultations, ward rounds), Symphony’s diarization and multichannel support are available on the /streams and /transcripts endpoints.

User profiles and cloud access

Legacy dictation apps typically store user profiles (voice model, auto-texts, settings, etc.) in a local cache or the host vendor cloud environment, accessible from any workstation with the desktop dictation client installed. Symphony does not have a user profile concept in the traditional sense. Configuration (commands, formatting, punctuation) is applied at the session level by the integrating application. This means:
  • There are no per-user settings to migrate to Symphony.
  • Personalization is handled by the application layer, not the Symphony API.
  • Cross-device consistency is ensured by your application’s own session management.

Connectivity requirements

The Symphony transcribe API requires a stable internet connection. There is no offline or local processing mode. Assess network reliability at all clinical sites before migration, particularly in environments that previously relied on the local fallback capabilities of legacy dictation apps.

Migration checklist

Use this checklist to structure your migration project.

Before go-live
  • Confirm the target EHR/application has Symphony integration in place and tested, especially with regard to transcript text insertion
  • Configure punctuation mode (spoken vs. automated) appropriate to clinical workflows
  • Test text formatting output against locale requirements (date, time, number formats)
  • Audit existing command and auto-text library and map to Symphony commands
  • Test that commands requiring application-layer implementation operate successfully
  • Prepare a command reference sheet for end users and train clinical users on any changes to command phrases
  • Verify network connectivity meets Symphony’s requirements at all sites
Post go-live
  • Monitor session feedback submissions and escalate recurring mis-recognitions to Corti using the Feedback form
  • Confirm custom vocabulary requirements and track vocabulary management feature availability
  • Review interim results behaviour with end users and confirm display settings are appropriate
  • Collect clinician feedback on command coverage and add missing commands as needed

Getting help