Commands are a key capability that turns a speech-to-text integration into a complete dictation solution. Put your users in the driver's seat of their workflow by defining commands that insert templates, navigate the application, automate repetitive tasks, and more.

Feature availability:

/transcribe

/streams

/transcripts

See the full API specification here for more information on how to define commands in your configuration.

API Definitions

Commands are set in dictation configuration when directly calling the /transcribe API or when using the Dictation Web Component. Explanation of parameters used in /transcribe command configuration:
Parameter | Type | Definition
id | string | Identifier for the command when it is detected by the speech recognizer.
phrases | array | Word sequence that is spoken to trigger the command.
variables | array | Placeholder in command phrases to define one or more words that can trigger the command. Variables can be defined as an enum list or open-ended wildcards.

Commands with enum Variables

When the server recognizes a configured command, it returns a command object over the WebSocket (instead of the transcript object used for recognized speech). The command object includes the command ID and variables, which the integrating application uses to execute the defined action. Below are some examples, including both the request configuration and server response.
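As a rough illustration of how a client might tell a command object apart from other messages, here is a minimal sketch. It assumes the message shape shown in the select_text server response later on this page (a `type` of "command" with `id` and `variables` under `data`). The `routeServerMessage` function and the handler registry are hypothetical names, not part of the API.

```javascript
// Sketch only: routes a raw server message to a handler keyed by command id.
// Assumes the { type: "command", data: { id, variables } } shape from the
// select_text response example; the registry (a Map) is a hypothetical helper.
function routeServerMessage(rawMessage, handlers) {
  const message = JSON.parse(rawMessage);

  if (message.type !== "command") {
    // Anything else (for example recognized text) is left to the caller.
    return { handled: false, id: null };
  }

  const { id, variables } = message.data;
  const handler = handlers.get(id);
  if (!handler) {
    return { handled: false, id };
  }

  handler(variables, message.data);
  return { handled: true, id };
}
```

A Map is used for the registry so that a command id can never collide with an inherited object property.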

Constraints

  • A single command should not mix phrases with and without variables: if a command defines a variable, every phrase for that command must use it.
  • All variables defined for a command should be used in its phrases.
  • Client applications are responsible for management and execution of the actions to be triggered by recognized commands.
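
The first two constraints can be checked on the client before a configuration is sent. The sketch below is one possible pre-flight check, not part of the API: `placeholdersIn` and `checkCommandConstraints` are hypothetical names. It takes the strict reading that every phrase uses every variable defined for the command, so relax that rule if your commands are structured differently.

```javascript
// Extract the {placeholder} names used in a phrase.
function placeholdersIn(phrase) {
  return [...phrase.matchAll(/\{([^{}]+)\}/g)].map((match) => match[1]);
}

// Hypothetical client-side check; returns a list of human-readable problems.
function checkCommandConstraints(command) {
  const problems = [];
  const keys = (command.variables ?? []).map((variable) => variable.key);

  for (const phrase of command.phrases) {
    const used = new Set(placeholdersIn(phrase));

    // Every defined variable should appear in every phrase (strict reading).
    for (const key of keys) {
      if (!used.has(key)) {
        problems.push(`phrase "${phrase}" does not use variable "${key}"`);
      }
    }

    // Every placeholder should have a matching variable definition.
    for (const name of used) {
      if (!keys.includes(name)) {
        problems.push(`phrase "${phrase}" uses undefined variable "${name}"`);
      }
    }
  }

  return problems;
}
```

An empty array means the command passed both checks.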

Enum Command Examples

Note that the id, phrases, and variables shown below are mere examples. Define these values as needed for your own commands.
Command to select text in the editor. The command includes a defined list of words that can be recognized for the select_range variable. Your application can define different select actions for each of the options, or add more complex handling for selecting specific words mentioned in the command utterance.
Pair this with a wildcard “select” command.
commands: [
    {
      id: "select_range",
      phrases: ["select {select_range}"],
      variables: [
        {
          key: "select_range",
          type: "enum",
          enum: ["all", "the last word", "the last sentence"]
        }
      ]
    }
]
Command to delete text. The command includes a defined list of words that can be recognized for the delete_range variable. Your application can define different delete actions for each of the options!
commands: [
    {
      id: "delete_range",
      phrases: ["delete {delete_range}"],
      variables: [
        {
          key: "delete_range",
          type: "enum",
          enum: ["everything", "the last word", "the last sentence", "that"]
        }
      ]
    }
]

Epic SmartPhrase

Command to insert an Epic SmartPhrase to pull in pre-defined text or structured data. See more about this Epic feature here.
commands: [
    {
      id: "epic_smarts",
      phrases: ["dot {smartphrase}", "pull {smartphrase}", "insert {smartphrase}"],
      variables: [
        {
          key: "smartphrase",
          type: "enum",
          enum: ["vitals", "labs", "medications", "attestation"]
        }
      ]
    }
]
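
What the client does with a recognized epic_smarts command is up to the integrating application. The sketch below assumes the application inserts a dot-prefixed token such as `.vitals` into the active field. That is an assumption suggested by the "dot {smartphrase}" phrase, not behavior defined by this API. `smartPhraseToken` and `KNOWN_SMARTPHRASES` are hypothetical names.

```javascript
// The same values as the enum list in the command configuration above.
const KNOWN_SMARTPHRASES = new Set(["vitals", "labs", "medications", "attestation"]);

// Hypothetical helper: turns the recognized variable into a dot-prefixed token.
// Returns null when the value is missing or not one of the configured options.
function smartPhraseToken(variables) {
  const name = String(variables.smartphrase ?? "").trim().toLowerCase();
  if (!KNOWN_SMARTPHRASES.has(name)) {
    return null;
  }
  return `.${name}`;
}
```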

Putting it All Together

Commands like those defined above can be combined into one dictation configuration with accompanying JavaScript to execute the actions.
The example below includes a navigation command to move between sections of a SOAP note template, a delete command to remove the last word, the last sentence, the current selection, or everything, and a select command to select text within the active text field:
import { deleteLastSentence, deleteLastWord, getActiveTextarea, hideInterim, insertAtSelection, selectLastSentence, selectLastWord, setActiveTextarea, setCommandOutput, showInterim } from "./helpers.js";

// Assumes you already have:
// - dictation (the Dictation Web Component instance)
// - textareas (array of editable fields)
// - activeIndex (index of the currently active field; declare it with `let`, since it is reassigned below)

const ui = {
  interimTranscript: document.getElementById("interimTranscript"),
  commandOutput: document.getElementById("commandOutput"),
};

dictation.dictationConfig = {
  primaryLanguage: "en",
  spokenPunctuation: true,
  automaticPunctuation: false,
  commands: [
    {
      id: "go_to_section",
      phrases: ["go to {section_key} section"],
      variables: [
        {
          key: "section_key",
          type: "enum",
          enum: ["subjective", "objective", "assessment", "plan", "next", "previous"],
        },
      ],
    },
    {
      id: "delete_range",
      phrases: ["delete {delete_range}"],
      variables: [
        {
          key: "delete_range",
          type: "enum",
          enum: ["everything", "the last word", "the last sentence", "that"],
        },
      ],
    },
    {
      id: "select_range",
      phrases: ["select {select_range}"],
      variables: [
        {
          key: "select_range",
          type: "enum",
          enum: ["all", "the last word", "the last sentence"],
        },
      ],
    },
  ],
};

dictation.addEventListener("transcript", (e) => {
  const { data } = e.detail;
  const textarea = getActiveTextarea({ textareas, activeIndex });

  if (data.isFinal) {
    insertAtSelection(textarea, `${data.text} `);
    hideInterim(ui);
    return;
  }

  showInterim(ui, data.rawTranscriptText);
});

dictation.addEventListener("command", (e) => {
  const { data } = e.detail;
  hideInterim(ui);

  if (data.id === "go_to_section") {
    const section = data.variables.section_key.toLowerCase();

    if (section === "next") {
      ({ activeIndex } = setActiveTextarea({ textareas, activeIndex }, activeIndex + 1));
      setCommandOutput(ui, `Command: ${data.id} (section: ${section})`);
      return;
    }

    if (section === "previous") {
      ({ activeIndex } = setActiveTextarea({ textareas, activeIndex }, activeIndex - 1));
      setCommandOutput(ui, `Command: ${data.id} (section: ${section})`);
      return;
    }

    const index = textareas.findIndex((el) => el.id.toLowerCase() === section);
    if (index !== -1) ({ activeIndex } = setActiveTextarea({ textareas, activeIndex }, index));

    setCommandOutput(ui, `Command: ${data.id} (section: ${section})`);
    return;
  }

  if (data.id === "delete_range") {
    const range = data.variables.delete_range.toLowerCase();
    const textarea = getActiveTextarea({ textareas, activeIndex });

    if (range === "everything") {
      textarea.value = "";
      setCommandOutput(ui, `Command: ${data.id} (range: ${range})`);
      return;
    }

    if (range === "the last word") {
      textarea.value = deleteLastWord(textarea.value);
      setCommandOutput(ui, `Command: ${data.id} (range: ${range})`);
      return;
    }

    if (range === "the last sentence") {
      textarea.value = deleteLastSentence(textarea.value);
      setCommandOutput(ui, `Command: ${data.id} (range: ${range})`);
      return;
    }

    if (range === "that") {
      const start = textarea.selectionStart ?? 0;
      const end = textarea.selectionEnd ?? 0;

      textarea.value =
        start !== end ? textarea.value.slice(0, start) + textarea.value.slice(end) : deleteLastWord(textarea.value);

      setCommandOutput(ui, `Command: ${data.id} (range: ${range})`);
      return;
    }
  }

  if (data.id === "select_range") {
    const range = data.variables.select_range.toLowerCase();
    const textarea = getActiveTextarea({ textareas, activeIndex });

    if (range === "all") {
      textarea.focus();
      textarea.setSelectionRange(0, textarea.value.length);
      setCommandOutput(ui, `Command: ${data.id} (range: ${range})`);
      return;
    }

    if (range === "the last word") {
      selectLastWord(textarea);
      setCommandOutput(ui, `Command: ${data.id} (range: ${range})`);
      return;
    }

    if (range === "the last sentence") {
      selectLastSentence(textarea);
      setCommandOutput(ui, `Command: ${data.id} (range: ${range})`);
      return;
    }
  }
});
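
The example above imports its text helpers from ./helpers.js, which is not shown. As a rough idea of what two of them could look like, here are hypothetical implementations of deleteLastWord and deleteLastSentence written as pure string functions. The sentence logic treats ".", "!" or "?" followed by whitespace as a sentence boundary, which is a simplifying assumption.

```javascript
// Hypothetical helper: removes the final word and the whitespace around it.
function deleteLastWord(text) {
  return text.replace(/\s*\S+\s*$/, "");
}

// Hypothetical helper: removes the final sentence, keeping everything up to
// and including the terminator of the previous sentence.
function deleteLastSentence(text) {
  const trimmed = text.trimEnd();
  // Drop the last sentence's own terminator so it is not treated as a boundary.
  const body = trimmed.replace(/[.!?]+$/, "");

  // Find the last terminator that is followed by whitespace. Requiring the
  // whitespace avoids splitting decimals such as "6.8".
  let cut = -1;
  for (const match of body.matchAll(/[.!?](?=\s)/g)) {
    cut = match.index;
  }

  return cut === -1 ? "" : trimmed.slice(0, cut + 1);
}
```

Both functions return text without a trailing space, so a caller that appends dictated text may want to add a separator before the next insert.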

Commands with Wildcard Variables

Unlike enum variables that are defined as part of the command configuration, wildcard variables provide the ability to recognize a command based on undefined, open-ended text.

Constraints

When using wildcard variables, the following rules apply:

Configuration:

  • Trigger word required: A wildcard variable must be preceded by at least one literal word. Phrases may not begin with a wildcard variable.
  • Separated by literals: Multiple wildcard variables in a single phrase must be separated by a non-empty literal string (e.g. "select {text1} end select {text2}" is valid; "{text1} {text2}" is not).
  • No enum field: Wildcard variables do not use an enum list — omit the enum field when type is "wildcard".

Recognition:

  • The literal trigger text, such as “select” or “insert before”, must be spoken before the wildcard text.
  • 2 seconds of silence must precede and follow the command phrase.
  • Wildcard variables can support a maximum of 10 words to be recognized following the literal trigger text. When the word cap is reached without a following literal endpoint, the command fires automatically with the accumulated words as the variable value.
  • Commands with wildcard variables are recognized after commands with enumerated variables, so that defined matches are recognized first when there are overlapping phrase terms defined.
The 2-second silence buffer and maximum word limit are server-side defaults, and these values are still being reviewed and refined. Contact us to share details about your implementation or for more information.
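
The configuration rules above can be checked on the client before a configuration is sent. This is a minimal sketch with a hypothetical function name, `checkWildcardConstraints`. It covers only the three configuration rules. The recognition rules (silence buffer and word cap) are enforced by the server and are not modeled here.

```javascript
// Hypothetical client-side check; returns a list of human-readable problems.
function checkWildcardConstraints(command) {
  const problems = [];
  const wildcardKeys = new Set();

  for (const variable of command.variables ?? []) {
    if (variable.type !== "wildcard") continue;
    wildcardKeys.add(variable.key);
    if ("enum" in variable) {
      problems.push(`wildcard "${variable.key}" should omit the enum field`);
    }
  }

  for (const phrase of command.phrases) {
    // Split into literal text and {placeholder} parts, dropping empty pieces.
    const parts = phrase
      .split(/(\{[^{}]+\})/)
      .map((part) => part.trim())
      .filter((part) => part !== "");

    let previousWasWildcard = false;
    parts.forEach((part, index) => {
      const isWildcard = part.startsWith("{") && wildcardKeys.has(part.slice(1, -1));
      if (isWildcard && index === 0) {
        problems.push(`phrase "${phrase}" begins with a wildcard`);
      }
      if (isWildcard && previousWasWildcard) {
        problems.push(`phrase "${phrase}" has wildcards with no literal between them`);
      }
      previousWasWildcard = isWildcard;
    });
  }

  return problems;
}
```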

Wildcard Command Examples

Adding a literal word at the end of the phrase creates an endpoint — the wildcard captures everything spoken between the trigger and the endpoint.
{
  commands: [
    {
      id: "select_text",
      phrases: ["select {utterance} end select"],
      variables: [
        {
          key: "utterance",
          type: "wildcard"
        }
      ]
    }
  ]
}
Two wildcard variables in a single phrase must be separated by a non-empty literal string — here "before" acts as the separator.
{
  commands: [
    {
      id: "insert_before_after",
      phrases: ["insert {content} before {target}"],
      variables: [
        {
          key: "content",
          type: "wildcard"
        },
        {
          key: "target",
          type: "wildcard"
        }
      ]
    }
  ]
}
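On the client, a recognized insert_before_after command would arrive with both `content` and `target` in its variables. One possible way to apply it to the text of the active field is sketched below. `insertBefore` is a hypothetical helper that matches the first occurrence of the target case-insensitively and leaves the text unchanged when the target is not found.

```javascript
// Hypothetical helper: inserts `content` (plus a space) before the first
// case-insensitive occurrence of `target` in `text`.
function insertBefore(text, content, target) {
  const index = text.toLowerCase().indexOf(target.toLowerCase());
  if (index === -1) {
    return text; // Target not present: leave the text as it is.
  }
  return `${text.slice(0, index)}${content} ${text.slice(index)}`;
}
```

For example, calling it with the text "patient has diabetes", the content "type two", and the target "diabetes" yields "patient has type two diabetes".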
Wildcard and enum variables can be combined in a single phrase. Enum variables are always matched first, so the wildcard captures whatever is spoken between the trigger and the recognized enum value.
{
  commands: [
    {
      id: "select_and_edit",
      phrases: ["select {text} and {operation}"],
      variables: [
        {
          key: "text",
          type: "wildcard"
        },
        {
          key: "operation",
          type: "enum",
          enum: ["overwrite", "delete", "replace"]
        }
      ]
    }
  ]
}

Putting it All Together

Using the “select_text” command to replace text existing in the document:

Command configuration

Command config: select_text
{
  commands: [
    {
      id: "select_text",
      phrases: ["select {utterance}"],
      variables: [
        {
          key: "utterance",
          type: "wildcard"
        }
      ]
    }
  ]
}

Speech

Dictation: “The patient is a forty year old male comma here today for annual well visit period he is known to have diabetes and hypertension and his last fasting A1c was six point eight percent period”

STT response

Transcript: “The patient is a 40-year-old male, here today for annual well visit. He is known to have diabetes and hypertension and his last fasting A1c was 6.8%.”

Speech

Dictation: “select male”

STT response

Server response: select_text command
{
    "type": "command",
    "data": {
        "id": "select_text",
        "variables": {"utterance": "male"},
        "rawTranscriptText": "select male",
        "start": 7.19,
        "end": 8.01
    }
}

Client app / UI

Highlight: “male”

Speech

Dictation: “female”

STT response

Transcript: “female”

Client app / UI

Insert (overwrite) text: “male” is replaced with “female”
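
The client-side steps in this walkthrough (locate the spoken utterance, highlight it, then overwrite it with the next transcript) could be sketched as follows. `findUtterance` and `replaceRange` are hypothetical helpers. Matching on whole words is a design assumption, made so that “male” is not found inside “female”.

```javascript
// Escape characters that have special meaning in a regular expression.
function escapeRegExp(value) {
  return value.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: returns the { start, end } offsets of the first
// whole-word, case-insensitive occurrence of `utterance`, or null.
function findUtterance(text, utterance) {
  const pattern = new RegExp(`(?<!\\w)${escapeRegExp(utterance)}(?!\\w)`, "i");
  const match = pattern.exec(text);
  return match ? { start: match.index, end: match.index + match[0].length } : null;
}

// Hypothetical helper: overwrites the given range with the replacement text.
function replaceRange(text, range, replacement) {
  return text.slice(0, range.start) + replacement + text.slice(range.end);
}
```

In a browser, the offsets can be passed to textarea.setSelectionRange(start, end) to highlight the match, after which the next final transcript overwrites the selection.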


Additional Information

Best Practices

Click here for more info on how to build commands effectively.

More Examples

Click here for a library of command configurations ready for use.
Commands are optional in the dictation configuration. Configure them based on how the integrating application will perform the actions. Please contact us to report errors or for more information on this feature.