This guide shows how to authenticate with Corti and run your first real-time dictation session using the /transcribe WebSocket endpoint.
1

Authentication

Obtain an access token for your tenant. See the authentication documentation for details.
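As a sketch of the token request, here is a hypothetical helper that builds the OAuth2 client-credentials call used in the end-to-end example at the bottom of this guide (`buildTokenRequest` is illustrative, not part of any SDK):

```javascript
// Hypothetical helper: builds the OAuth2 client-credentials token request.
// The auth URL pattern matches the end-to-end example at the end of this guide.
function buildTokenRequest(env, tenant, clientId, clientSecret) {
  return {
    url: `https://auth.${env}.corti.app/realms/${tenant}/protocol/openid-connect/token`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/x-www-form-urlencoded" },
      body: new URLSearchParams({
        client_id: clientId,
        client_secret: clientSecret,
        grant_type: "client_credentials",
        scope: "openid",
      }).toString(),
    },
  };
}

// Usage (retrieving a real token requires valid credentials):
// const { url, options } = buildTokenRequest("eu", "<TENANT>", "<ID>", "<SECRET>");
// const token = (await (await fetch(url, options)).json()).access_token;
```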
2

Open a `/transcribe` WebSocket

Base URL: wss://api.{environment}.corti.app/audio-bridge/v2/transcribe
Required query parameters:
  • tenant-name
  • token (URL-encoded Bearer <access_token>)
Full URL template:
wss://api.{environment}.corti.app/audio-bridge/v2/transcribe?tenant-name={tenant-name}&token=Bearer%20{access_token}
Example:
JavaScript
import WebSocket from "ws";

const env = "eu"; // or "us"
const tenant = "<YOUR_TENANT_NAME>";
const token = "<ACCESS_TOKEN>";

const wsUrl =
`wss://api.${env}.corti.app/audio-bridge/v2/transcribe` +
`?tenant-name=${encodeURIComponent(tenant)}` +
`&token=${encodeURIComponent(`Bearer ${token}`)}`;

const ws = new WebSocket(wsUrl);
3

Send configuration

After the WebSocket connection is opened, send a config message within 10 seconds, or the server closes the socket with CONFIG_TIMEOUT.
Example configuration message:
JavaScript
const configurationMessage = {
  type: "config",
  configuration: {
    primaryLanguage: "en",
    spokenPunctuation: true,
    commands: [
      {
        id: "next_section",
        phrases: ["next section", "go to next section"],
      },
      {
        id: "insert_template",
        phrases: [
          "insert my {template_name} template",
          "insert {template_name} template",
        ],
        variables: [
          {
            key: "template_name",
            type: "enum",
            enum: ["soap", "radiology", "referral"],
          },
        ],
      },
    ],
    formatting: {
      dates: "long_text",
      times: "h24",
      numbers: "numerals_above_nine",
      measurements: "abbreviated",
      numericRanges: "numerals",
      ordinals: "numerals",
    },
  },
};
Send the configuration as soon as the socket opens:
JavaScript
ws.on("open", () => {
  ws.send(JSON.stringify(configurationMessage));
});
Wait for a message with {"type": "CONFIG_ACCEPTED"} before sending audio. If you receive CONFIG_DENIED or CONFIG_TIMEOUT, close the socket and fix the configuration.
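One way to enforce this ordering is a small client-side gate that only permits audio once the config has been accepted. This is a minimal sketch (not part of any official SDK); the message type names follow the guide above:

```javascript
// Minimal sketch: gate audio sending on the server's config response.
// States: "pending" until the server replies, then "accepted" or "rejected".
function createConfigGate() {
  let state = "pending";
  return {
    // Feed every incoming message here; returns the current state.
    handle(msg) {
      if (msg.type === "CONFIG_ACCEPTED") state = "accepted";
      if (msg.type === "CONFIG_DENIED" || msg.type === "CONFIG_TIMEOUT") state = "rejected";
      return state;
    },
    canSendAudio() {
      return state === "accepted";
    },
  };
}

// Wiring it into the socket:
// const gate = createConfigGate();
// ws.on("message", (raw) => {
//   if (gate.handle(JSON.parse(raw.toString())) === "rejected") ws.close();
// });
// ...and only call ws.send(audioChunk) while gate.canSendAudio() is true.
```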
4

Real-Time Stateless Dictation

Send audio frames

Send audio as binary WebSocket messages. See the documentation for details on supported audio formats.
JavaScript
// audioChunk: Buffer or Uint8Array containing raw audio
ws.send(audioChunk);
Send a continuous stream of 250 ms audio chunks while recording is active; do not send overlapping frames.
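For raw PCM, the 250 ms framing can be sketched as below. The sample rate and sample width here (16 kHz, 16-bit mono) are assumptions for illustration; match them to the audio format you are actually sending:

```javascript
// Hypothetical chunker: splits a PCM buffer into sequential 250 ms frames.
// Assumes 16 kHz, 16-bit (2 bytes/sample) mono audio - adjust to the format
// you actually use; see the supported-formats documentation linked above.
function chunkPcm(buffer, sampleRate = 16000, bytesPerSample = 2, chunkMs = 250) {
  const chunkBytes = Math.floor((sampleRate * chunkMs) / 1000) * bytesPerSample;
  const chunks = [];
  for (let offset = 0; offset < buffer.length; offset += chunkBytes) {
    chunks.push(buffer.subarray(offset, offset + chunkBytes)); // no overlap
  }
  return chunks;
}

// Streaming loop: pace one chunk per 250 ms while recording is active, e.g.
// for (const chunk of chunkPcm(audioBuffer)) ws.send(chunk);
```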

Handle responses

The server sends messages with different type values, for example:
{
  "type": "transcript",
  "data": {
    "text": "Patient reports mild chest pain.",
    "rawTranscriptText": "patient reports mild chest pain period",
    "start": 0.0,
    "end": 3.2,
    "isFinal": true
  }
}
Basic message handler:
JavaScript
ws.on("message", (raw) => {
  const msg = JSON.parse(raw.toString());

  switch (msg.type) {
    case "transcript":
      console.log("Transcript:", msg.data.text);
      break;
    case "command":
      console.log("Command:", msg.data.id, msg.data.variables);
      break;
    case "usage":
      console.log("Usage credits:", msg.credits);
      break;
    case "error":
      console.error("Error:", msg.error);
      break;
    default:
      console.log("Other message:", msg);
  }
});
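Recognized commands can then be routed to application logic. The sketch below assumes `msg.data.variables` arrives as a `{ variableKey: value }` map (an assumption inferred from the handler above; verify against the API reference):

```javascript
// Hypothetical dispatcher for recognized commands. The exact shape of
// msg.data.variables ({ variableKey: value }) is an assumption here.
const commandHandlers = {
  next_section: () => console.log("Moving to next section"),
  insert_template: (vars) => console.log(`Inserting template: ${vars.template_name}`),
};

// Returns true if the command was handled, false if no handler is registered.
function dispatchCommand(msg) {
  const handler = commandHandlers[msg.data.id];
  if (!handler) return false;
  handler(msg.data.variables ?? {});
  return true;
}
```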
5

Flush the audio buffer (optional)

Use flush to force pending transcript segments and dictation commands to be returned without closing the session. This is useful for separating dictation into logical sections.
ws.send(JSON.stringify({ type: "flush" }));
Wait for type: "flushed" before treating the section as complete.
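The send-then-wait pattern can be wrapped in a promise. This is a sketch assuming the `ws` event API used throughout this guide (`flushSection` and its timeout are illustrative, not part of any SDK):

```javascript
// Sketch: sends a flush and resolves once the server confirms with
// type "flushed". Rejects if no confirmation arrives within timeoutMs.
function flushSection(ws, timeoutMs = 5000) {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => {
      ws.off("message", onMessage);
      reject(new Error("flush timed out"));
    }, timeoutMs);
    const onMessage = (raw) => {
      const msg = JSON.parse(raw.toString());
      if (msg.type === "flushed") {
        clearTimeout(timer);
        ws.off("message", onMessage);
        resolve(msg);
      }
    };
    ws.on("message", onMessage);
    ws.send(JSON.stringify({ type: "flush" }));
  });
}

// Usage: await flushSection(ws); then start dictating the next section.
```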
6

End the session

Send end when you are done sending audio:
ws.send(JSON.stringify({ type: "end" }));
The server then:
  1. Emits any remaining transcript or command messages.
  2. Sends usage info, for example:
{ "type": "usage", "credits": 0.1 }
  3. Sends:
{ "type": "ended" }
  4. Closes the WebSocket.
You can also close the client socket explicitly after receiving ended:
ws.on("message", (raw) => {
  const msg = JSON.parse(raw.toString());
  if (msg.type === "ended") {
    ws.close();
  }
});
7

Basic end-to-end example

JavaScript
import WebSocket from "ws";
import fetch from "node-fetch";

const CLIENT_ID = "<CLIENT_ID>";
const CLIENT_SECRET = "<CLIENT_SECRET>";
const TENANT = "<TENANT>";
const ENV = "eu"; // or "us"

async function getAccessToken() {
  const res = await fetch(
    `https://auth.${ENV}.corti.app/realms/${TENANT}/protocol/openid-connect/token`,
    {
      method: "POST",
      headers: { "Content-Type": "application/x-www-form-urlencoded" },
      body: new URLSearchParams({
        client_id: CLIENT_ID,
        client_secret: CLIENT_SECRET,
        grant_type: "client_credentials",
        scope: "openid",
      }),
    }
  );

  if (!res.ok) throw new Error(`Token error: ${res.status}`);

  const json = await res.json();
  return json.access_token;
}

async function run() {
  const token = await getAccessToken();

  const wsUrl =
    `wss://api.${ENV}.corti.app/audio-bridge/v2/transcribe` +
    `?tenant-name=${encodeURIComponent(TENANT)}` +
    `&token=${encodeURIComponent(`Bearer ${token}`)}`;

  const ws = new WebSocket(wsUrl);

  ws.on("open", () => {
    const config = {
      type: "config",
      configuration: {
        primaryLanguage: "en",
        spokenPunctuation: true,
        automaticPunctuation: false,
        commands: [
          {
            id: "next_section",
            phrases: ["next section", "go to next section"],
          },
        ],
        formatting: {
          dates: "long_text",
          times: "h24",
          numbers: "numerals_above_nine",
          measurements: "abbreviated",
          numericRanges: "numerals",
          ordinals: "numerals",
        },
      },
    };

    ws.send(JSON.stringify(config));
  });

  ws.on("message", (raw) => {
    const msg = JSON.parse(raw.toString());

    if (msg.type === "transcript") {
      console.log("Transcript:", msg.data.text);
    } else if (msg.type === "command") {
      console.log("Command:", msg.data.id, msg.data.variables);
    } else if (msg.type === "error") {
      console.error("Error:", msg.error);
    } else if (msg.type === "ended") {
      ws.close();
    }
  });

  ws.on("error", (err) => {
    console.error("Socket error:", err);
  });

  // Example: close the session after 30 seconds if you are not streaming real audio
  setTimeout(() => {
    ws.send(JSON.stringify({ type: "end" }));
  }, 30000);
}

run().catch((err) => {
  console.error("Fatal error:", err);
});