The @corti/dictation-web package provides a set of custom elements that handle microphone management, audio streaming, transcript display, and voice commands on top of the Corti API. It works with any frontend framework or plain HTML. The library provides two usage modes:
  1. <corti-dictation> — opinionated, all-in-one component with built-in UI (recommended for most use cases)
  2. Modular components — individual building blocks (<dictation-root>, <dictation-recording-button>, etc.) for fully custom layouts

Installation

npm install @corti/dictation-web

Module import

// Side-effect import -- registers all custom elements
import "@corti/dictation-web";

// Named import -- access component classes directly
import { CortiDictation } from "@corti/dictation-web";

Quick start

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>Corti Dictation</title>
    <script type="module" src="https://cdn.jsdelivr.net/npm/@corti/dictation-web/dist/bundle.js"></script>
</head>
<body>
    <corti-dictation id="dictation"></corti-dictation>

    <script>
        const dictation = document.getElementById("dictation");

        dictation.addEventListener("ready", () => {
            console.log("Dictation component is ready");
        });

        dictation.addEventListener("transcript", (event) => {
            const { text, isFinal } = event.detail.data;
            console.log(`${isFinal ? "Final" : "Interim"}: ${text}`);
        });

        // Set your access token (obtained from your backend)
        dictation.accessToken = "YOUR_ACCESS_TOKEN";

        // Configure dictation (required)
        dictation.dictationConfig = {
            primaryLanguage: "en",
            automaticPunctuation: true,
            spokenPunctuation: true,
        };
    </script>
</body>
</html>
The component handles microphone permissions, device selection, and audio streaming automatically. You only need to provide authentication and listen for events.
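Access tokens are typically minted by your backend so that API credentials never reach the browser. A minimal sketch of fetching one is shown below; the endpoint URL, the POST method, and the `{ accessToken }` response shape are assumptions for illustration, not part of the library:

```javascript
// Hypothetical: your backend exposes an endpoint that mints short-lived
// tokens for the browser. Adapt the URL and response shape to your own API.
// fetchImpl is injectable so the function is easy to test or mock.
async function fetchAccessToken(endpoint, fetchImpl = globalThis.fetch) {
  const res = await fetchImpl(endpoint, { method: "POST" });
  if (!res.ok) throw new Error(`Token request failed: ${res.status}`);
  const { accessToken } = await res.json();
  return accessToken;
}

// Usage (endpoint path is an example):
// dictation.accessToken = await fetchAccessToken("/api/corti-token");
```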

Configuration

Set dictationConfig as a JavaScript property to configure the transcription engine. The type matches the config message from the Transcribe WebSocket API.
const dictation = document.querySelector("corti-dictation");

dictation.dictationConfig = {
    primaryLanguage: "en",
    automaticPunctuation: true,
};
See the Transcribe API Reference for the full configuration schema.

Modular components

For custom UI layouts, use individual components inside a <dictation-root> parent:
All modular components require a <dictation-root> parent to provide context. They cannot be used standalone.
<dictation-root id="root" accessToken="YOUR_TOKEN">
    <div class="custom-layout">
        <dictation-recording-button></dictation-recording-button>
        <dictation-settings-menu settingsEnabled="device,language,keybinding"></dictation-settings-menu>
    </div>
</dictation-root>

<script>
    const root = document.getElementById("root");
    root.addEventListener("transcript", (e) => {
        console.log(e.detail.data.text);
    });
</script>
| Component | Description |
| --- | --- |
| <dictation-root> | Context provider. Same properties as <corti-dictation> (auth, config, devices, keybindings). Add the noWrapper attribute to remove default styling. |
| <dictation-recording-button> | Start/stop button with audio visualization. Supports allowButtonFocus. Provides startRecording(), stopRecording(), toggleRecording(), openConnection(), and closeConnection() methods. |
| <dictation-settings-menu> | Settings panel with device and language selectors. Supports settingsEnabled. |
| <dictation-device-selector> | Standalone device dropdown. Supports disabled. |
| <dictation-language-selector> | Standalone language dropdown. Supports disabled. |
| <dictation-keybinding-selector> | Keybinding configuration. Supports keybindingType ("push-to-talk" or "toggle-to-talk") and disabled. |
See the API Reference for full per-component property and method tables.

Keyboard shortcuts

The component supports two keybinding modes:
| Mode | Behavior | Default key |
| --- | --- | --- |
| Push-to-talk | Hold key to record, release to stop | Space |
| Toggle-to-talk | Press to start, press again to stop | Enter |
Configure via attributes:
<corti-dictation
    settingsEnabled="device,language,keybinding"
    pushToTalkKeybinding="Space"
    toggleToTalkKeybinding="Enter"
></corti-dictation>
Keys are specified as event.key names (e.g. "Space", "k", "Meta") or event.code values (e.g. "KeyK", "Backquote"). Modifier combinations are not supported.
When both keybindings are set to the same key, toggle-to-talk takes priority.
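The matching rule above can be pictured as a plain comparison against both event.key and event.code. This is an illustrative sketch of that logic, not the library's internal implementation:

```javascript
// Illustrative only: a keybinding string such as "k" (event.key) or
// "KeyK" / "Space" (event.code) matches if either field of the keyboard
// event equals it. Modifier combinations are not considered.
function matchesKeybinding(event, binding) {
  if (!binding) return false;
  return event.key === binding || event.code === binding;
}
```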

Preventing keybinding activation

Use the keybinding-activated event to conditionally block keybindings:
dictation.addEventListener("keybinding-activated", (event) => {
    if (document.activeElement.tagName === "TEXTAREA") {
        event.preventDefault(); // Don't trigger recording while typing
    }
});

Voice commands

Voice commands let users control your application by speaking phrases. Configure commands via dictationConfig.commands — each command has a phrases pattern and optional variables. The component emits a command event when a phrase is matched. See the Transcribe API reference for the full command schema.
import {
  deleteLastSentence,
  deleteLastWord,
  getActiveTextarea,
  hideInterim,
  insertAtSelection,
  selectLastSentence,
  selectLastWord,
  setActiveTextarea,
  setCommandOutput,
  showInterim,
} from "./helpers.js";

// Assumes you already have:
// - dictation (the Dictation Web Component instance)
// - textareas (array of editable fields)
// - activeIndex (index of the currently active field)

const ui = {
  interimTranscript: document.getElementById("interimTranscript"),
  commandOutput: document.getElementById("commandOutput"),
};

dictation.dictationConfig = {
  primaryLanguage: "en",
  spokenPunctuation: true,
  automaticPunctuation: false,
  commands: [
    {
      id: "go_to_section",
      phrases: ["go to {section_key} section"],
      variables: [
        {
          key: "section_key",
          type: "enum",
          enum: ["subjective", "objective", "assessment", "plan", "next", "previous"],
        },
      ],
    },
    {
      id: "delete_range",
      phrases: ["delete {delete_range}"],
      variables: [
        {
          key: "delete_range",
          type: "enum",
          enum: ["everything", "the last word", "the last sentence", "that"],
        },
      ],
    },
    {
      id: "select_range",
      phrases: ["select {select_range}"],
      variables: [
        {
          key: "select_range",
          type: "enum",
          enum: ["all", "the last word", "the last sentence"],
        },
      ],
    },
  ],
};

dictation.addEventListener("transcript", (e) => {
  const { data } = e.detail;
  const textarea = getActiveTextarea({ textareas, activeIndex });

  if (data.isFinal) {
    insertAtSelection(textarea, `${data.text} `);
    hideInterim(ui);
    return;
  }

  showInterim(ui, data.rawTranscriptText);
});

dictation.addEventListener("command", (e) => {
  const { data } = e.detail;
  hideInterim(ui);

  if (data.id === "go_to_section") {
    const section = data.variables.section_key.toLowerCase();

    if (section === "next") {
      ({ activeIndex } = setActiveTextarea({ textareas, activeIndex }, activeIndex + 1));
      setCommandOutput(ui, `Command: ${data.id} (section: ${section})`);
      return;
    }

    if (section === "previous") {
      ({ activeIndex } = setActiveTextarea({ textareas, activeIndex }, activeIndex - 1));
      setCommandOutput(ui, `Command: ${data.id} (section: ${section})`);
      return;
    }

    const index = textareas.findIndex((el) => el.id.toLowerCase() === section);
    if (index !== -1) ({ activeIndex } = setActiveTextarea({ textareas, activeIndex }, index));

    setCommandOutput(ui, `Command: ${data.id} (section: ${section})`);
    return;
  }

  if (data.id === "delete_range") {
    const range = data.variables.delete_range.toLowerCase();
    const textarea = getActiveTextarea({ textareas, activeIndex });

    if (range === "everything") {
      textarea.value = "";
      setCommandOutput(ui, `Command: ${data.id} (range: ${range})`);
      return;
    }

    if (range === "the last word") {
      textarea.value = deleteLastWord(textarea.value);
      setCommandOutput(ui, `Command: ${data.id} (range: ${range})`);
      return;
    }

    if (range === "the last sentence") {
      textarea.value = deleteLastSentence(textarea.value);
      setCommandOutput(ui, `Command: ${data.id} (range: ${range})`);
      return;
    }

    if (range === "that") {
      const start = textarea.selectionStart ?? 0;
      const end = textarea.selectionEnd ?? 0;

      textarea.value =
        start !== end ? textarea.value.slice(0, start) + textarea.value.slice(end) : deleteLastWord(textarea.value);

      setCommandOutput(ui, `Command: ${data.id} (range: ${range})`);
      return;
    }
  }

  if (data.id === "select_range") {
    const range = data.variables.select_range.toLowerCase();
    const textarea = getActiveTextarea({ textareas, activeIndex });

    if (range === "all") {
      textarea.focus();
      textarea.setSelectionRange(0, textarea.value.length);
      setCommandOutput(ui, `Command: ${data.id} (range: ${range})`);
      return;
    }

    if (range === "the last word") {
      selectLastWord(textarea);
      setCommandOutput(ui, `Command: ${data.id} (range: ${range})`);
      return;
    }

    if (range === "the last sentence") {
      selectLastSentence(textarea);
      setCommandOutput(ui, `Command: ${data.id} (range: ${range})`);
      return;
    }
  }
});
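The example above imports its text helpers from a local ./helpers.js, which is not part of the package. Plausible implementations of two of them are sketched below as plain string functions; your own versions may differ:

```javascript
// Sketch of two helpers assumed by the example above (names and behavior
// are illustrative, not prescribed by the library).

// Remove the final word along with surrounding trailing whitespace.
function deleteLastWord(text) {
  return text.replace(/\s*\S+\s*$/, "");
}

// Remove the final sentence. Sentences are split on . ! ? terminators,
// keeping each terminator with its sentence; the last chunk is dropped.
function deleteLastSentence(text) {
  const sentences = text.match(/[^.!?]*[.!?]+\s*|[^.!?]+$/g) ?? [];
  sentences.pop();
  return sentences.join("").replace(/\s+$/, "");
}
```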

Attribute formatting

| Type | Format | Example |
| --- | --- | --- |
| Boolean | Presence = true, absence = false | <corti-dictation debug-display-audio> |
| String | Attribute value | accessToken="token" |
| Array | Comma-separated | settingsEnabled="device,language" |
| Object | JavaScript property only | dictation.authConfig = { ... } |
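For array-valued attributes, the comma-separated string maps to a simple parse like the one below. This is illustrative only; the component performs this conversion internally:

```javascript
// Illustrative: turn a comma-separated attribute value such as
// "device, language" into ["device", "language"], ignoring blanks.
function parseListAttribute(value) {
  if (!value) return [];
  return value
    .split(",")
    .map((item) => item.trim())
    .filter(Boolean);
}
```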

For support or questions, reach out through help.corti.app