Overview

The WebSocket Secure (WSS) /transcribe API enables real-time, bidirectional communication with the Corti system for stateless speech-to-text dictation. Clients can send and receive structured data, including transcripts and detected commands. This page explains how to integrate these capabilities.
  • For real-time ambient documentation interactions, use the /stream WSS endpoint instead.
  • For asynchronous transcript generation as part of an interaction, refer to the /transcripts endpoint.

Environment Options

Environment   Description
us            US-based instance
eu            EU-based instance

Establishing a Connection

Clients must initiate a WebSocket connection using the wss:// scheme.
When creating an interaction, the 200 response provides a websocketUrl for that interaction, including the tenant-name as a URL parameter. In addition to the tenant-name parameter, authenticating the WSS connection requires a token parameter that carries the Bearer access token.

Query Parameters

tenant-name (string, required): Specifies the tenant context.
token (string, required): Bearer $token
  curl --request GET \
    --url "wss://api.${environment}.corti.app/audio-bridge/v2/transcribe?tenant-name=${tenant}&token=Bearer%20${accessToken}"
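
As a rough illustration of how the connection URL is assembled, the sketch below builds it from the same three values used in the curl example. The helper name buildTranscribeUrl and the Environment type are hypothetical and not part of the Corti SDK; only the URL layout and the two query parameters come from this page.

```typescript
// Hypothetical helper (not part of the Corti SDK): builds the /transcribe URL
// from the pieces shown in the curl example.
type Environment = "us" | "eu";

function buildTranscribeUrl(
  environment: Environment,
  tenantName: string,
  accessToken: string,
): string {
  const base = `wss://api.${environment}.corti.app/audio-bridge/v2/transcribe`;
  // encodeURIComponent turns the space in "Bearer <token>" into %20 and
  // escapes any reserved characters in the tenant name or token.
  const tenant = encodeURIComponent(tenantName);
  const token = encodeURIComponent(`Bearer ${accessToken}`);
  return `${base}?tenant-name=${tenant}&token=${token}`;
}
```

For example, buildTranscribeUrl("eu", "acme", "abc123") returns wss://api.eu.corti.app/audio-bridge/v2/transcribe?tenant-name=acme&token=Bearer%20abc123.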

Using SDK

You can use the Corti SDK (currently in “alpha”, not for production use) to connect to the /transcribe endpoint.
import { CortiClient, CortiEnvironment } from "@corti/sdk";

const cortiClient = new CortiClient({
    tenantName: "YOUR_TENANT_NAME",
    environment: CortiEnvironment.BetaEu,
    auth: {
        accessToken: "YOUR_ACCESS_TOKEN"
    },
});

const transcribeSocket = await cortiClient.transcribe.connect();

Handshake Response

101 Switching Protocols

Indicates a successful WebSocket connection. Once connected, send a configuration message that specifies the input and expected output formats.

Sending Messages

Clients must send a stream configuration message and wait for a response of type CONFIG_ACCEPTED before transmitting other data. If the configuration is invalid, the server responds with CONFIG_DENIED. The configuration must be sent within 10 seconds of opening the WebSocket; otherwise the session times out with CONFIG_TIMEOUT.
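
One way to enforce this ordering on the client is a small gate that tracks the configuration state and only allows audio once CONFIG_ACCEPTED has arrived. This is a minimal sketch: the ConfigGate class and ServerMessage shape are hypothetical, and only the three message type names are taken from the text above.

```typescript
// Hypothetical client-side gate; only the message type names come from this page.
type ConfigState = "pending" | "accepted" | "rejected";

interface ServerMessage {
  type: string;
  reason?: string;
}

class ConfigGate {
  private state: ConfigState = "pending";

  // Feed every parsed server message through this method.
  // The first configuration response decides the state; later messages do not change it.
  handle(message: ServerMessage): ConfigState {
    if (this.state !== "pending") {
      return this.state;
    }
    if (message.type === "CONFIG_ACCEPTED") {
      this.state = "accepted";
    } else if (
      message.type === "CONFIG_DENIED" ||
      message.type === "CONFIG_TIMEOUT"
    ) {
      this.state = "rejected";
    }
    return this.state;
  }

  // Audio may only be sent once the configuration has been accepted.
  canSendAudio(): boolean {
    return this.state === "accepted";
  }
}
```

A caller would check canSendAudio() before each audio send and stop the session if handle() ever returns "rejected".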

Basic Stream Configuration

primaryLanguage (string, required): The locale of the primary spoken language. See https://docs.corti.ai/about/languages for more information.
interimResults (bool): When true, returns interim results for reduced latency.
spokenPunctuation (bool): When true, converts spoken punctuation such as "period" or "slash" into "." or "/".
automaticPunctuation (bool): When true, automatically punctuates and capitalizes the final transcript.

Advanced Stream Configuration

Commands

The transcribe endpoint supports registration and detection of commands, common in dictation workflows. Extend the configuration with the following parameters to register commands that should be detected.
commands (command object[]): The commands that should be registered and detected.
Here is an example configuration for transcription of dictated audio in English, with interim results, spoken punctuation and automatic punctuation enabled, and example commands defined.
Configuration example
{
  primaryLanguage: "en",
  interimResults: true,
  spokenPunctuation: true,
  automaticPunctuation: true,
  commands: [
    {
      id: "next_section",
      phrases: ["next section", "go to next section"]
    },
    {
      id: "delete",
      phrases: ["delete that"]
    },
    {
      id: "insert_template",
      phrases: [
        "insert my {template_name} template",
        "insert {template_name} template"
      ],
      variables: [
        {
          key: "template_name",
          type: "enum",
          enum: ["radiology", "referral"]
        }
      ]
    }
  ]
}
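
Before sending a configuration, it can help to check that every {placeholder} used in a command phrase has a matching entry in variables, as in the insert_template command above. The sketch below assumes that this one-to-one correspondence is required; the page does not state it explicitly. The Command and CommandVariable interfaces mirror the example, and findUndeclaredPlaceholders is a hypothetical helper.

```typescript
// Shapes inferred from the configuration example; not an official type definition.
interface CommandVariable {
  key: string;
  type: string;
  enum?: string[];
}

interface Command {
  id: string;
  phrases: string[];
  variables?: CommandVariable[];
}

// Returns placeholder names used in phrases that have no matching variable key.
function findUndeclaredPlaceholders(command: Command): string[] {
  const declared = (command.variables ?? []).map((v) => v.key);
  const missing: string[] = [];
  const placeholder = /\{([^{}]+)\}/g;
  for (const phrase of command.phrases) {
    placeholder.lastIndex = 0; // reset the global regex for each phrase
    let match: RegExpExecArray | null;
    while ((match = placeholder.exec(phrase)) !== null) {
      const name = match[1];
      if (declared.indexOf(name) === -1 && missing.indexOf(name) === -1) {
        missing.push(name);
      }
    }
  }
  return missing;
}
```

An empty result means every placeholder is declared; any returned names point to phrases that reference a variable the command does not define.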

Using SDK

You can use the Corti SDK (currently in “alpha”, not for production use) to send configuration.
You can provide the configuration either directly when connecting, or send it as a separate message after establishing the connection:
const configuration = {
  primaryLanguage: "en",
  commands: [
    {
      id: "next_section",
      phrases: ["next section", "go to next section"]
    },
  ]
};

const transcribeSocket = await cortiClient.transcribe.connect(
  { configuration }
);

Formatting

The transcribe endpoint provides the option to configure formatting preferences. Extend the configuration with the following parameters to apply formatting that should be used when returning text output.
Formatting functionality is currently in beta. API details are subject to change ahead of general release.
Defining formatting configuration is optional. When these preferences are not configured, the default values listed below are applied automatically.
formatting (object): Defines formatting preferences.
Here is an example configuration for transcription of dictated audio in English, with interim results, spoken punctuation enabled, and formatting options defined:
Configuration example with formatting
{
  primaryLanguage: "en",
  interimResults: true,
  spokenPunctuation: true,
  commands: [...],
  formatting: {       // default values:
    dates: 1,         // long format ("3 February 2025")
    times: 2,         // 24-hour format ("16:00")
    numbers: 1,       // single digit as words, multi-digit as number ("one, two, … nine, 10, 11")
    units: 1,         // abbreviated ("mm", "cm", "in"…)
    abbreviations: 1, // abbreviated ("BP 120/80 mmHg")
    numericRanges: 1, // abbreviated ("1-10")
    ordinals: 1,      // abbreviated ("1st, 2nd")
  },
}
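
The server applies the defaults on its own, but a client may still want to see the effective values, for example to display them in a settings screen. A minimal sketch, assuming the seven keys and default values listed in the example above; FormattingOptions and resolveFormatting are hypothetical names, and the meaning of non-default values is not covered here.

```typescript
// Keys and default values copied from the formatting example; the type is an assumption.
interface FormattingOptions {
  dates: number;
  times: number;
  numbers: number;
  units: number;
  abbreviations: number;
  numericRanges: number;
  ordinals: number;
}

const DEFAULT_FORMATTING: FormattingOptions = {
  dates: 1,
  times: 2,
  numbers: 1,
  units: 1,
  abbreviations: 1,
  numericRanges: 1,
  ordinals: 1,
};

// Overlays caller-supplied preferences on the documented defaults.
function resolveFormatting(
  overrides: Partial<FormattingOptions> = {},
): FormattingOptions {
  return { ...DEFAULT_FORMATTING, ...overrides };
}
```

Calling resolveFormatting({ times: 1 }) changes only times and leaves the other six keys at their defaults.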

Sending audio

After the configuration has been accepted, send the raw audio data to be transcribed.

Using SDK

You can use the Corti SDK (currently in “alpha”, not for production use) to send audio data.
transcribeSocket.sendAudio(audioChunk); // sendAudio does not split the audio; pass data that is already chunked
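
Because sendAudio does not chunk its input, the caller has to split larger buffers first. The sketch below shows one simple way to do that with fixed-size slices. The chunkAudio helper is hypothetical, and a suitable chunk size is not specified on this page, so the value you pass is your own assumption.

```typescript
// Hypothetical helper: splits a byte buffer into fixed-size chunks.
// The last chunk may be shorter than chunkSize.
function chunkAudio(audio: Uint8Array, chunkSize: number): Uint8Array[] {
  if (chunkSize <= 0 || Math.floor(chunkSize) !== chunkSize) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const chunks: Uint8Array[] = [];
  for (let offset = 0; offset < audio.length; offset += chunkSize) {
    // subarray creates a view on the same memory, so no bytes are copied.
    chunks.push(audio.subarray(offset, offset + chunkSize));
  }
  return chunks;
}
```

Each returned chunk could then be passed to transcribeSocket.sendAudio in order, assuming the SDK accepts the byte type you produce.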

Ending

To end the /transcribe session, send a message with type: end. This signals the server to send any remaining transcript segments and detected commands. The server then sends a usage message, followed by a message of type ended, and closes the connection. An example usage message:
{
  "type": "usage",
  "credits": 0.1
}
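
To illustrate the closing sequence, the sketch below scans the messages received after sending end and reports the credits from the usage message and whether the final ended message arrived. The names ClosingMessage, SessionSummary, and summarizeClosing are hypothetical. Because this page writes the final message type as both ended and ENDED, the comparison is case-insensitive; that is a defensive assumption, not documented behavior.

```typescript
// Hypothetical shapes for the messages seen while a session is closing.
interface ClosingMessage {
  type: string;
  credits?: number;
}

interface SessionSummary {
  credits: number | null; // from the usage message, if one was seen
  ended: boolean;         // true once the final ended message was seen
}

function summarizeClosing(messages: ClosingMessage[]): SessionSummary {
  const summary: SessionSummary = { credits: null, ended: false };
  for (const message of messages) {
    // Case-insensitive match, since the casing of "ended" varies on this page.
    const type = message.type.toLowerCase();
    if (type === "usage" && typeof message.credits === "number") {
      summary.credits = message.credits;
    } else if (type === "ended") {
      summary.ended = true;
    }
  }
  return summary;
}
```

Remaining transcript and command messages pass through untouched, so the same list can still be handed to the regular message handling.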

Using SDK

You can use the Corti SDK (currently in “alpha”, not for production use) to end the /transcribe session.
When using automatic configuration (passing configuration to connect), the socket will close itself without reconnecting when it receives an ENDED message. When using manual configuration, the socket will attempt to reconnect after the server closes the connection. To prevent this, you must subscribe to the ended message and manually close the connection.
const transcribeSocket = await cortiClient.transcribe.connect({
  configuration
});

transcribeSocket.sendEnd({ type: "end" });

Responses

Configuration

type (string, required, default: "CONFIG_ACCEPTED"): Returned when sending a valid configuration.
sessionId (uuid, required): Returned when sending a valid configuration.

Transcripts

type (string, required, default: "transcript")
data (object, required)

Commands

type (string, required, default: "command")
data (object, required)
Command response
{
  "type": "command",
  "data": {
    "id": "insert_template",
    "variables": {
      "template_name": "radiology"
    },
    "rawTranscriptText": "insert my radiology template",
    "start": 2.3,
    "end": 2.9
  }
}
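
A detected command is typically routed to application logic by its id. The sketch below shows one possible dispatch table. The CommandData and CommandMessage interfaces are inferred from the example response above, and dispatchCommand is a hypothetical helper rather than an SDK function.

```typescript
// Shapes inferred from the command response example; not an official type definition.
interface CommandData {
  id: string;
  variables?: Record<string, string>;
  rawTranscriptText: string;
  start: number;
  end: number;
}

interface CommandMessage {
  type: "command";
  data: CommandData;
}

type CommandHandler = (data: CommandData) => void;

// Runs the handler registered for the command id.
// Returns false when no handler is registered, so the caller can log or ignore it.
function dispatchCommand(
  message: CommandMessage,
  handlers: Record<string, CommandHandler>,
): boolean {
  const id = message.data.id;
  // hasOwnProperty avoids matching inherited keys such as "toString".
  if (!Object.prototype.hasOwnProperty.call(handlers, id)) {
    return false;
  }
  handlers[id](message.data);
  return true;
}
```

The handler keys would match the id values registered in the commands configuration, such as next_section or insert_template.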

Using SDK

You can use the Corti SDK (currently in “alpha”, not for production use) to subscribe to responses from the /transcribe endpoint.
transcribeSocket.on("message", (message) => {
  switch (message.type) {
    case "transcript":
      console.log("Transcript:", message.data.text);
      break;
    case "command":
      console.log("Command detected:", message.data.id, message.data.variables);
      break;
    case "error":
      console.error("Error:", message.error);
      break;
    case "usage":
      console.log("Usage credits:", message.credits);
      break;
    default:
      // handle other messages
      break;
  }
});

Error Responses

type (string, required): Returned when sending an invalid configuration. Possible values: CONFIG_DENIED, CONFIG_TIMEOUT.
reason (string): The reason the configuration is invalid.
sessionId (uuid, required): The session ID.
Once configuration has been accepted and the session is running, you may encounter runtime or application-level errors. These are sent as JSON objects with the following structure:
{
  "type": "error",
  "error": {
    "id": "error id",
    "title": "error title",
    "status": 400,
    "details": "error details",
    "doc":"link to documentation"
  }
}
In some cases, an error message causes the server to end the stream; it then sends a usage message followed by a message of type ENDED.
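
When working with the raw WebSocket rather than the SDK, incoming JSON has to be checked before its fields are read. The sketch below is one way to recognize the runtime error structure shown above and turn it into a log line. RuntimeErrorMessage, isRuntimeError, and describeError are hypothetical names; the field names come from the example.

```typescript
// Shape taken from the runtime error example on this page.
interface RuntimeErrorMessage {
  type: "error";
  error: {
    id: string;
    title: string;
    status: number;
    details: string;
    doc: string;
  };
}

// Type guard: checks the fields that describeError relies on.
function isRuntimeError(value: unknown): value is RuntimeErrorMessage {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const candidate = value as { type?: unknown; error?: unknown };
  if (candidate.type !== "error") {
    return false;
  }
  const error = candidate.error as
    | { id?: unknown; title?: unknown; status?: unknown; details?: unknown }
    | null
    | undefined;
  return (
    typeof error === "object" &&
    error !== null &&
    typeof error.id === "string" &&
    typeof error.title === "string" &&
    typeof error.status === "number" &&
    typeof error.details === "string"
  );
}

// Formats a runtime error as a single log line.
function describeError(message: RuntimeErrorMessage): string {
  const { status, title, details } = message.error;
  return `[${status}] ${title}: ${details}`;
}
```

Applied to the example message above, describeError returns "[400] error title: error details".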

Using SDK

You can use the Corti SDK (currently in “alpha”, not for production use) to handle error messages.
With the recommended approach (passing configuration to connect), configuration errors (e.g., CONFIG_DENIED) and runtime errors both trigger the error event and automatically close the socket. You can also inspect the original message in the message handler. With manual configuration, configuration errors are only received as messages (not as error events), and you must close the socket manually to avoid reconnection.
const transcribeSocket = await cortiClient.transcribe.connect({
  configuration
});

transcribeSocket.on("error", (error) => {
  // Emitted for both configuration and runtime errors
  console.error("Error event:", error);
  // The socket will close itself automatically
});

// The original messages can still be read through the regular "message" subscription
transcribeSocket.on("message", (message) => {
  if (
    message.type === "CONFIG_DENIED" ||
    message.type === "CONFIG_TIMEOUT"
  ) {
    console.log("Configuration error (message):", message);
  }

  if (message.type === "error") {
    console.log("Runtime error (message):", message);
  }
});