Commands are a key feature that turns a speech-to-text application into a complete dictation solution. They put your users in the driver's seat of their workflow: you define commands to insert templates, navigate the application, automate repetitive tasks, and more.

Feature availability:

/transcribe

/streams

/transcripts

See the full API specification for more information on how to define commands in your configuration.

API Definitions

Commands are set in the dictation configuration, either when calling the /transcribe API directly or when using the Dictation Web Component. The parameters used in the /transcribe command configuration are:

| Parameter | Type | Definition |
| --- | --- | --- |
| id | string | Identifier for the command when it is detected by the speech recognizer. |
| phrases | string array | Word sequence that is spoken to trigger the command. |
| variables | string array | Placeholder in phrases to define multiple words that should trigger the command. The word options are defined as an enum list within the variables object. |
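
Because each placeholder in phrases has to be backed by an entry in variables, it can help to check a configuration before sending it. The following is a minimal sketch, not part of the API: findUndefinedPlaceholders is a hypothetical helper, and it assumes placeholders are written as {key} and variables are objects with a key field, as in the examples in this guide.

```javascript
// Hypothetical helper (not part of the API): returns the {placeholder} names
// used in a command's phrases that have no matching entry in its variables.
function findUndefinedPlaceholders(command) {
  const definedKeys = new Set((command.variables ?? []).map((v) => v.key));
  const missing = new Set();

  for (const phrase of command.phrases ?? []) {
    // Match every {name} placeholder in the phrase.
    for (const match of phrase.matchAll(/\{([^{}]+)\}/g)) {
      const key = match[1].trim();
      if (!definedKeys.has(key)) missing.add(key);
    }
  }
  return [...missing];
}

const command = {
  id: "go_to_section",
  phrases: ["go to {section_key} section"],
  variables: [{ key: "section_key", type: "enum", enum: ["subjective", "plan"] }],
};

// An empty array means every placeholder is defined.
console.log(findUndefinedPlaceholders(command).length); // 0
```

Running a check like this when the configuration is built surfaces a misspelled variable key early, rather than as a command that never triggers.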

Example Commands

When the server recognizes a configured command, it returns a command object over the WebSocket instead of the transcript object used for recognized speech. The command object includes the command ID and variables, which the integrating application uses to execute the defined action. The examples below show the request configuration for a few common commands.
Note that the id, phrases, and variables shown below are only examples. Define these values as needed for your own commands. Additionally, your application must manage the actions triggered by recognized commands.
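
Since the application owns the actions, one way to organize them is a lookup from command ID to handler. The sketch below is an assumption-laden illustration: dispatchCommand and the handlers map are hypothetical names, and the payload is assumed to expose id plus a variables object keyed by variable key, which is how the combined example in this guide reads it.

```javascript
// Hypothetical dispatcher: looks up an application-defined handler by command id.
// Assumes a payload of the form { id, variables }, with variables keyed by variable key.
function dispatchCommand(handlers, data) {
  const handler = handlers.get(data.id);
  if (!handler) return false; // unknown command id: nothing to run
  handler(data.variables ?? {});
  return true;
}

// Illustrative handlers that only record what they would do.
const actions = [];
const handlers = new Map([
  ["go_to_section", ({ section_key }) => actions.push(`navigate:${section_key}`)],
  ["delete_range", ({ delete_range }) => actions.push(`delete:${delete_range}`)],
]);

const handled = dispatchCommand(handlers, {
  id: "go_to_section",
  variables: { section_key: "plan" },
});
const ignored = dispatchCommand(handlers, { id: "unknown_command", variables: {} });

console.log(handled, ignored, actions.length); // true false 1
```

A Map keeps the lookup limited to the IDs you registered, so an unexpected ID is simply ignored.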
Command to navigate around your application (e.g., to a section within the current document template). The command includes a defined list of words that can be recognized for the section_key variable.
commands: [
    {
      id: "go_to_section",
      phrases: ["go to {section_key} section"],
      variables: [
        {
          key: "section_key",
          type: "enum",
          enum: ["subjective", "objective", "assessment", "plan", "next", "previous"]
        }
      ]
    }
]
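
On the application side, the recognized section_key has to be turned into a target section. The following is a rough sketch under stated assumptions: resolveSectionIndex is a hypothetical helper, sections are identified by strings matching the enum values, and "next" and "previous" are clamped at the ends rather than wrapping around.

```javascript
// Hypothetical helper: maps a recognized section_key to an index into sectionIds.
// "next" and "previous" move relative to the active section and stop at the ends.
function resolveSectionIndex(sectionIds, activeIndex, sectionKey) {
  const key = sectionKey.toLowerCase();
  const lastIndex = sectionIds.length - 1;

  if (key === "next") return Math.min(activeIndex + 1, lastIndex);
  if (key === "previous") return Math.max(activeIndex - 1, 0);

  // Named section: fall back to the current section if the name is unknown.
  const index = sectionIds.findIndex((id) => id.toLowerCase() === key);
  return index === -1 ? activeIndex : index;
}

const sections = ["subjective", "objective", "assessment", "plan"];
console.log(resolveSectionIndex(sections, 1, "assessment")); // 2
console.log(resolveSectionIndex(sections, 3, "next")); // 3
```

Clamping is a design choice for this sketch; wrapping from the last section back to the first would be equally valid if that suits your template.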

Select Text

Command to select text in the editor. The command includes a defined list of words that can be recognized for the select_range variable. Your application can define a different select action for each of the options, or add more complex handling for selecting specific words mentioned in the command utterance.
commands: [
    {
      id: "select_range",
      phrases: ["select {select_range}"],
      variables: [
        {
          key: "select_range",
          type: "enum",
          enum: ["all", "the last word", "the last sentence"]
        }
      ]
    }
]
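
To act on "the last word" or "the last sentence", the application needs character offsets for the selection. The sketch below shows one possible way to compute them; lastWordRange and lastSentenceRange are hypothetical helpers, a word is assumed to be a run of non-whitespace characters, and a sentence is assumed to end with ".", "!", or "?".

```javascript
// Hypothetical helper: [start, end] offsets of the last word, ignoring trailing whitespace.
function lastWordRange(text) {
  const end = text.trimEnd().length;
  let start = end;
  // Walk back until the character before `start` is whitespace.
  while (start > 0 && !/\s/.test(text[start - 1])) start -= 1;
  return [start, end];
}

// Hypothetical helper: [start, end] offsets of the last sentence.
function lastSentenceRange(text) {
  const end = text.trimEnd().length;
  // Search for the previous terminator, skipping the final character of the text.
  const body = text.slice(0, Math.max(end - 1, 0));
  const boundary = Math.max(
    body.lastIndexOf("."),
    body.lastIndexOf("!"),
    body.lastIndexOf("?")
  );
  let start = boundary + 1;
  // Skip the whitespace between the two sentences.
  while (start < end && /\s/.test(text[start])) start += 1;
  return [start, end];
}

const text = "No fever. Patient is stable. ";
console.log(lastWordRange(text)); // offsets 21 and 28
console.log(lastSentenceRange(text)); // offsets 10 and 28
```

With a textarea element, the resulting offsets can be passed to textarea.setSelectionRange(start, end) after calling textarea.focus(). Note that this simple rule treats trailing punctuation as part of the last word and splits on abbreviations such as "Dr.".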

Delete Text

Command to delete text. The command includes a defined list of words that can be recognized for the delete_range variable. Your application can define a different delete action for each of the options.
commands: [
    {
      id: "delete_range",
      phrases: ["delete {delete_range}"],
      variables: [
        {
          key: "delete_range",
          type: "enum",
          enum: ["everything", "the last word", "the last sentence", "that"]
        }
      ]
    }
]
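
The delete actions themselves are up to the application. As an illustration only, the following sketch shows one way the "the last word" and "the last sentence" options could be implemented as pure string functions; the function names and the sentence rule (a sentence ends with ".", "!", or "?") are assumptions, not part of the API.

```javascript
// Hypothetical helper: removes the final word and any whitespace around it.
function deleteLastWord(text) {
  return text.trimEnd().replace(/\S+$/, "").trimEnd();
}

// Hypothetical helper: removes the final sentence, keeping everything up to
// and including the terminator of the sentence before it.
function deleteLastSentence(text) {
  const trimmed = text.trimEnd();
  // Ignore the final character so the last sentence's own terminator is skipped.
  const body = trimmed.slice(0, -1);
  const boundary = Math.max(
    body.lastIndexOf("."),
    body.lastIndexOf("!"),
    body.lastIndexOf("?")
  );
  return boundary === -1 ? "" : trimmed.slice(0, boundary + 1);
}

console.log(deleteLastWord("Patient is stable")); // Patient is
console.log(deleteLastSentence("No fever. Patient is stable.")); // No fever.
```

Keeping these as pure functions makes them easy to unit test before wiring them to a text field.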

Putting it All Together

The three commands defined above can be combined into one dictation configuration, with accompanying JavaScript to execute the actions. The example below includes a navigation command to move between sections of a SOAP note template, a delete command to remove the last word, the last sentence, the current selection, or everything in the active text field, and a select command to select text within the active text field:
import {
  deleteLastSentence,
  deleteLastWord,
  getActiveTextarea,
  hideInterim,
  insertAtSelection,
  selectLastSentence,
  selectLastWord,
  setActiveTextarea,
  setCommandOutput,
  showInterim,
} from "./helpers.js";

// Assumes you already have:
// - dictation (the Dictation Web Component instance)
// - textareas (array of editable fields)
// - activeIndex (index of the currently active field; declared with `let` because it is reassigned below)

const ui = {
  interimTranscript: document.getElementById("interimTranscript"),
  commandOutput: document.getElementById("commandOutput"),
};

dictation.dictationConfig = {
  primaryLanguage: "en",
  spokenPunctuation: true,
  automaticPunctuation: false,
  commands: [
    {
      id: "go_to_section",
      phrases: ["go to {section_key} section"],
      variables: [
        {
          key: "section_key",
          type: "enum",
          enum: ["subjective", "objective", "assessment", "plan", "next", "previous"],
        },
      ],
    },
    {
      id: "delete_range",
      phrases: ["delete {delete_range}"],
      variables: [
        {
          key: "delete_range",
          type: "enum",
          enum: ["everything", "the last word", "the last sentence", "that"],
        },
      ],
    },
    {
      id: "select_range",
      phrases: ["select {select_range}"],
      variables: [
        {
          key: "select_range",
          type: "enum",
          enum: ["all", "the last word", "the last sentence"],
        },
      ],
    },
  ],
};

dictation.addEventListener("transcript", (e) => {
  const { data } = e.detail;
  const textarea = getActiveTextarea({ textareas, activeIndex });

  if (data.isFinal) {
    insertAtSelection(textarea, `${data.text} `);
    hideInterim(ui);
    return;
  }

  showInterim(ui, data.rawTranscriptText);
});

dictation.addEventListener("command", (e) => {
  const { data } = e.detail;
  hideInterim(ui);

  if (data.id === "go_to_section") {
    const section = data.variables.section_key.toLowerCase();

    if (section === "next") {
      ({ activeIndex } = setActiveTextarea({ textareas, activeIndex }, activeIndex + 1));
      setCommandOutput(ui, `Command: ${data.id} (section: ${section})`);
      return;
    }

    if (section === "previous") {
      ({ activeIndex } = setActiveTextarea({ textareas, activeIndex }, activeIndex - 1));
      setCommandOutput(ui, `Command: ${data.id} (section: ${section})`);
      return;
    }

    const index = textareas.findIndex((el) => el.id.toLowerCase() === section);
    if (index !== -1) ({ activeIndex } = setActiveTextarea({ textareas, activeIndex }, index));

    setCommandOutput(ui, `Command: ${data.id} (section: ${section})`);
    return;
  }

  if (data.id === "delete_range") {
    const range = data.variables.delete_range.toLowerCase();
    const textarea = getActiveTextarea({ textareas, activeIndex });

    if (range === "everything") {
      textarea.value = "";
      setCommandOutput(ui, `Command: ${data.id} (range: ${range})`);
      return;
    }

    if (range === "the last word") {
      textarea.value = deleteLastWord(textarea.value);
      setCommandOutput(ui, `Command: ${data.id} (range: ${range})`);
      return;
    }

    if (range === "the last sentence") {
      textarea.value = deleteLastSentence(textarea.value);
      setCommandOutput(ui, `Command: ${data.id} (range: ${range})`);
      return;
    }

    if (range === "that") {
      const start = textarea.selectionStart ?? 0;
      const end = textarea.selectionEnd ?? 0;

      textarea.value =
        start !== end ? textarea.value.slice(0, start) + textarea.value.slice(end) : deleteLastWord(textarea.value);

      setCommandOutput(ui, `Command: ${data.id} (range: ${range})`);
      return;
    }
  }

  if (data.id === "select_range") {
    const range = data.variables.select_range.toLowerCase();
    const textarea = getActiveTextarea({ textareas, activeIndex });

    if (range === "all") {
      textarea.focus();
      textarea.setSelectionRange(0, textarea.value.length);
      setCommandOutput(ui, `Command: ${data.id} (range: ${range})`);
      return;
    }

    if (range === "the last word") {
      selectLastWord(textarea);
      setCommandOutput(ui, `Command: ${data.id} (range: ${range})`);
      return;
    }

    if (range === "the last sentence") {
      selectLastSentence(textarea);
      setCommandOutput(ui, `Command: ${data.id} (range: ${range})`);
      return;
    }
  }
});
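
The example imports its helpers from ./helpers.js, which is not shown here. As a rough idea of what one of them might look like, the sketch below gives a possible insertAtSelection; this is an assumed implementation, not the actual helper, and it relies only on the value, selectionStart, and selectionEnd properties that a textarea element provides.

```javascript
// Assumed implementation of insertAtSelection: replaces the current selection
// (or inserts at the caret) and places the caret after the inserted text.
function insertAtSelection(field, text) {
  const start = field.selectionStart ?? field.value.length;
  const end = field.selectionEnd ?? start;

  field.value = field.value.slice(0, start) + text + field.value.slice(end);

  // Collapse the selection to a caret just after the new text.
  const caret = start + text.length;
  field.selectionStart = caret;
  field.selectionEnd = caret;
}

// A plain object stands in for a <textarea> so the sketch runs outside a browser.
const field = { value: "Patient stable", selectionStart: 8, selectionEnd: 8 };
insertAtSelection(field, "is ");
console.log(field.value); // Patient is stable
```

Inserting at the caret rather than appending lets users dictate into the middle of existing text or overwrite a selection made with a select command.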

Best Practices

Click here for more info on how to build commands effectively.

Commands are optional in the dictation configuration. Configure them based on how the integrating application will perform the actions.

Please contact us to report errors or for more information on this feature.