
Overview

The WebSocket Secure (WSS) /stream API enables real-time, bidirectional communication with the Corti system for interaction streaming. Clients can send and receive structured data, including transcripts and fact updates (learn more about FactsR™ here). This guide walks through integrating the Corti WSS API for real-time interaction streaming.
This /stream endpoint supports real-time ambient documentation interactions and clinical decision support workflows.
  • If you are looking for a stateless endpoint geared towards front-end dictation workflows, use the /transcribe WSS endpoint
  • If you are looking for asynchronous ambient documentation interactions, refer to the /documents endpoint

1. Establishing a Connection

Clients must initiate a WebSocket connection using the wss:// scheme and provide a valid interaction ID in the URL.
When you create an interaction, the 200 response includes a websocketUrl for that interaction, with the tenant-name already set as a URL parameter. To authenticate the WSS stream, you must additionally supply a token parameter carrying the Bearer access token.

Path Parameters

id
uuid
required
Unique interaction identifier

Query Parameters

environment
enum
required
eu or us
tenant-name
string
required
Specifies the tenant context
token
string
required
Bearer $token
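As a sketch of how a client might assemble the final connection URL, the helper below (buildStreamUrl is a hypothetical name, and the base URL in the usage comment is illustrative) appends the token parameter to a websocketUrl returned by the interaction-creation response:

```typescript
// Append the required token query parameter to the websocketUrl returned
// when creating the interaction. The tenant-name parameter is already
// included in that URL; only the Bearer access token must be added.
function buildStreamUrl(websocketUrl: string, token: string): string {
  const url = new URL(websocketUrl);
  url.searchParams.set("token", `Bearer ${token}`);
  return url.toString();
}

// e.g. buildStreamUrl("wss://api.example.com/stream/<id>?tenant-name=acme", accessToken)
```

Note that URL query serialization encodes the space in `Bearer <token>` for you.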

Using SDK

You can use the Corti SDK (currently in “beta”) to connect to a stream endpoint.
import { CortiClient, CortiEnvironment } from "@corti/sdk";

const cortiClient = new CortiClient({
    tenantName: "YOUR_TENANT_NAME",
    environment: CortiEnvironment.Eu,
    auth: {
        accessToken: "YOUR_ACCESS_TOKEN"
    },
});

const streamSocket = await cortiClient.stream.connect({
  id: "<interactionId>"
});

2. Handshake Responses

101 Switching Protocols

Indicates a successful WebSocket connection. Upon connecting, send a config message to define the session configuration: specify the input language and expected output preferences.
The config message must be sent within 10 seconds of the WebSocket being opened; otherwise the server returns CONFIG_TIMEOUT and you must establish a new WSS connection.

3. Sending Messages

Configuration

Declare your /stream configuration with a message of "type": "config" and a "configuration" object containing the options below. The type field is required, along with the transcription: primaryLanguage and mode: type and outputLocale configuration parameters. The remaining parameters are optional, depending on your need and workflow.
Configuration notes:
  • Clients must send a stream configuration message and wait for a response of type CONFIG_ACCEPTED before transmitting other data.
  • If the configuration is not valid it will return CONFIG_DENIED.
  • The configuration must be committed within 10 seconds of opening the WebSocket, else it will time-out with CONFIG_TIMEOUT.
type
string
default:"config"
required
configuration
object
required

Example

WSS /stream configuration example
{
  "type": "config",
  "configuration": {
    "transcription": {
      "primaryLanguage": "en",
      "isDiarization": false,
      "isMultichannel": false,
      "participants": [
        {
          "channel": 0,
          "role": "multiple"
        }
      ]
    },
    "mode": {
      "type": "facts",
      "outputLocale": "en"
    }
  }
}
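As a client-side sanity check before sending the config message, a minimal sketch like the one below (hasRequiredConfigFields is a hypothetical helper; the server performs the authoritative validation and may reject on other grounds) can verify the required fields listed above are present:

```typescript
// Expected shape of the required config fields, per the documentation above.
type StreamConfig = {
  type: "config";
  configuration: {
    transcription: { primaryLanguage: string };
    mode: { type: string; outputLocale: string };
  };
};

// Checks only the documented required fields; optional fields are ignored.
function hasRequiredConfigFields(msg: any): msg is StreamConfig {
  return (
    msg?.type === "config" &&
    typeof msg?.configuration?.transcription?.primaryLanguage === "string" &&
    typeof msg?.configuration?.mode?.type === "string" &&
    typeof msg?.configuration?.mode?.outputLocale === "string"
  );
}
```

Running this before sending avoids a round trip that would end in CONFIG_DENIED for trivially malformed messages.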

Using SDK

You can use the Corti SDK (currently in “beta”) to send stream configuration.
You can provide the configuration either directly when connecting, or send it as a separate message after establishing the connection:
const configuration = {
  transcription: {
    primaryLanguage: "en",
    isDiarization: false,
    isMultichannel: false,
    participants: [
      {
        channel: 0,
        role: "multiple"
      }
    ]
  },
  mode: {
    type: "facts",
    outputLocale: "en"
  }
};

const streamSocket = await cortiClient.stream.connect({
  id: "<interactionId>",
  configuration
});

Sending Audio

Ensure that your configuration was accepted before sending audio, and that the initial audio chunk is not too small: it must contain the headers needed to properly decode the audio. We recommend sending audio in chunks of 250-500 ms. In terms of buffering, the limit is 64000 bytes per chunk. Audio data should be sent as raw binary without JSON wrapping.
A variety of common audio formats are supported; audio will be passed through a transcoder before speech-to-text processing. Similarly, specification of sample rate, depth or other audio settings is not required at this time. See more details on supported audio formats here.
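To stay under the per-chunk byte limit mentioned above, a client can split its audio buffer before sending. A minimal sketch (chunkAudio is a hypothetical helper; the 64000-byte limit is taken from the documentation above):

```typescript
// Maximum bytes per audio message, per the documented buffering limit.
const MAX_CHUNK_BYTES = 64000;

// Split a raw audio buffer into chunks no larger than maxBytes each.
// subarray() creates views, so no audio data is copied.
function chunkAudio(buffer: Uint8Array, maxBytes = MAX_CHUNK_BYTES): Uint8Array[] {
  const chunks: Uint8Array[] = [];
  for (let offset = 0; offset < buffer.length; offset += maxBytes) {
    chunks.push(buffer.subarray(offset, offset + maxBytes));
  }
  return chunks;
}
```

Each resulting chunk could then be passed to the socket's send method (or the SDK's sendAudio) in order.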

Channels, participants, and speakers

In most workflows, especially in-person settings, mono-channel audio should be used. If the microphone is a stereo microphone, set isMultichannel: false so the audio is converted to mono-channel, preventing duplicate transcripts from being returned. In a telehealth workflow, or other virtual setting, the virtual audio may be on one channel (e.g., from WebRTC) with audio from the local client's microphone on a separate channel. In this scenario, set isMultichannel: true and assign each channel the relevant participant role (e.g., if the doctor is on the local client, assign channel 0 the participant role doctor, and assign the virtual audio for the patient to channel 1 with the participant role patient).
Diarization is independent of audio channels and participant roles: it enables speaker separation within mono audio. With isDiarization: true, transcript segments are assigned speaker IDs automatically, with the first speaker identified as speakerId: 0, the second as speakerId: 1, etc. With isDiarization: false, all transcript segments are assigned speakerId: -1. Read more here.
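As a sketch of how a client might consume diarized output, the helper below (groupBySpeaker is a hypothetical name; the segment fields follow the transcript response shape shown later in this document) collects finalized transcript text per speaker ID:

```typescript
// Minimal shape of a transcript segment, per the transcript response format.
type Segment = { transcript: string; speakerId: number; final: boolean };

// Group finalized transcript segments by speakerId. With diarization off,
// everything lands under the single key -1.
function groupBySpeaker(segments: Segment[]): Map<number, string[]> {
  const bySpeaker = new Map<number, string[]>();
  for (const seg of segments) {
    if (!seg.final) continue; // skip interim (non-final) segments
    const list = bySpeaker.get(seg.speakerId) ?? [];
    list.push(seg.transcript);
    bySpeaker.set(seg.speakerId, list);
  }
  return bySpeaker;
}
```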

Using SDK

You can use the Corti SDK (currently in “beta”) to send audio data to the stream.
To send audio, use the sendAudio method on the stream socket. Audio should be sent as binary chunks (e.g., ArrayBuffer):
streamSocket.sendAudio(chunk); // note: this method does not chunk the audio for you

Flush the Audio Buffer

To flush the audio buffer, forcing transcript segments to be returned over the web socket (e.g., when turning off or muting the microphone for the patient to share something private, not to be recorded, during the conversation), send a message -
{
  "type":"flush"
}
The server will return text for audio sent before the flush message and then respond with message -
{
  "type":"flushed"
}
The web socket will remain open so recording can continue. FactsR generation (i.e., when working in configuration.mode: facts) is not impacted by the flush event and will continue to process as normal.
Client side considerations:
  1. If you rely on a flush event to separate data (e.g., for different sections in an EHR template), be sure to receive the flushed event before moving on to the next data field.
  2. When using the web browser MediaRecorder API, audio is buffered and only emitted at the configured timeslice interval. Therefore, before sending a flush message, call MediaRecorder.requestData() to force any remaining buffered audio on the client to be transmitted to the server. This ensures all audio reaches the server before the flush is processed.
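The flush bookkeeping described above can be sketched as a small collector that buffers transcript text until the server's flushed confirmation arrives (makeFlushCollector is a hypothetical helper; the message shapes follow the examples in this document):

```typescript
// Minimal shape of incoming stream messages relevant to flushing.
type StreamMessage = { type: string; data?: { transcript: string }[] };

// Buffer transcript segments and hand the batch to a callback only once
// the server confirms the flush with a "flushed" message.
function makeFlushCollector(onBatch: (texts: string[]) => void) {
  let pending: string[] = [];
  return function handleMessage(msg: StreamMessage): void {
    if (msg.type === "transcript" && msg.data) {
      pending.push(...msg.data.map((d) => d.transcript));
    } else if (msg.type === "flushed") {
      onBatch(pending); // safe to move to the next data field here
      pending = [];
    }
  };
}
```

The callback fires only on flushed, which matches the advice above about not moving to the next data field early.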

Ending the Session

To end the /stream session, send a message -
{
  "type": "end"
}
This will signal the server to send any remaining transcript segments and facts (depending on mode configuration). Then, the server will send two messages -
{
  "type":"usage",
  "credits":0.1
}
{
  "type":"ENDED"
}
Following the message type ENDED, the server will close the web socket.
You can open a new WebSocket connection for the interaction at any time and send the configuration again.

Using SDK

You can use the Corti SDK (currently in “beta”) to control the stream status.
When using automatic configuration (passing configuration to connect), the socket will close itself without reconnecting when it receives an ENDED message. When using manual configuration, the socket will attempt to reconnect after the server closes the connection. To prevent this, you must subscribe to the ENDED message and manually close the connection.
const streamSocket = await cortiClient.stream.connect({
  id: "<interactionId>",
  configuration
});

streamSocket.sendEnd({ type: "end" });

streamSocket.on("message", (message) => {
  if (message.type === "usage") {
    console.log("Usage:", message);
  }

  // message is received, but connection closes automatically
  if (message.type === "ENDED") {
    console.log("ENDED:", message);
  }
});

4. Responses

Configuration

type
string
default:"CONFIG_ACCEPTED"
required
Returned when sending a valid configuration.
sessionId
uuid
required
Returned when sending a valid configuration.

Transcripts

type
string
default:"transcript"
required
data
object
required
Transcript response
{
  "type": "transcript",
  "data": [
    {
      "id": "UUID",
      "transcript": "Patient presents with fever and cough.",
      "time": { "start": 1.71, "end": 11.296 },
      "final": true,
      "speakerId": -1,
      "participant": { "channel": 0 }
    }
  ]
}
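As a sketch of rendering this response, the helper below (toLines is a hypothetical name) flattens a transcript message into display lines with start/end times, using the fields from the example above:

```typescript
// Shape of a transcript response message, per the example above.
type TranscriptMessage = {
  type: "transcript";
  data: {
    transcript: string;
    time: { start: number; end: number };
    speakerId: number;
  }[];
};

// Format each segment as "[start-end] text" with times rounded to 2 decimals.
function toLines(msg: TranscriptMessage): string[] {
  return msg.data.map(
    (s) => `[${s.time.start.toFixed(2)}-${s.time.end.toFixed(2)}] ${s.transcript}`
  );
}
```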

Facts

type
string
default:"facts"
required
facts
object
required
Fact response
{
  "type": "facts",
  "fact": [
    {
      "id": "UUID",
      "text": "Patient has a history of hypertension.",
      "group": "medical-history",
      "groupId": "UUID",
      "isDiscarded": false,
      "source": "core",
      "createdAt": "2024-02-28T12:34:56Z",
      "updatedAt": ""
    }
  ]
}
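As a sketch of consuming this response, the helper below (activeFactsByGroup is a hypothetical name; the fields follow the example above, which carries the fact array under the key "fact") keeps only non-discarded facts, grouped by their group label:

```typescript
// Minimal shape of a fact, per the fact response example above.
type Fact = { text: string; group: string; isDiscarded: boolean };

// Drop discarded facts and index the rest by their group label.
function activeFactsByGroup(facts: Fact[]): Record<string, string[]> {
  const grouped: Record<string, string[]> = {};
  for (const f of facts) {
    if (f.isDiscarded) continue; // skip facts the system has discarded
    (grouped[f.group] ??= []).push(f.text);
  }
  return grouped;
}
```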
By default, incoming audio and returned data streams are persisted on the server, associated with the interactionId. You may query the interaction to retrieve the stored recordings, transcripts, and facts via the relevant REST endpoints. Audio recordings are saved in .webm format; transcripts and facts as JSON objects. Data persistence can be disabled by Corti upon request when needed to support compliance with your applicable regulations and data handling preferences.

Using SDK

You can use the Corti SDK (currently in “beta”) to subscribe to stream messages.
streamSocket.on("message", (message) => {
  // Distinguish message types
  switch (message.type) {
    case "transcript":
      // Handle transcript message
      console.log("Transcript:", message);
      break;
    case "facts":
      // Handle facts message
      console.log("Facts:", message);
      break;
    case "error":
      // Handle error message
      console.error("Error:", message);
      break;
    default:
      // Handle other message types
      console.log("Other message:", message);
  }
});

streamSocket.on("error", (error) => {
  // Handle error
  console.error(error);
});

streamSocket.on("close", () => {
  // Handle socket close
  console.log("Stream closed");
});

Flushed

type
string
default:"flushed"
required
Returned by server, after processing flush event from client, to return transcript segments
{
  "type":"flushed"
}

Ended

type
string
default:"usage"
required
Returned by server, after processing end event from client, to convey amount of credits consumed
{
  "type":"usage",
  "credits":0.1
}
type
string
default:"ENDED"
required
Returned by server, after processing end event from client, before closing the web socket
{
  "type":"ENDED"
}

5. Error Handling

In case of an invalid or missing interaction ID, the server will return an error before opening the WebSocket.
After opening the WebSocket, you must commit the configuration within 10 seconds; otherwise the WebSocket will close again.
At the beginning of a WebSocket session the following messages related to configuration can be returned.

  {"type": "CONFIG_DENIED"} // in case the configuration is not valid
  {"type": "CONFIG_MISSING"}
  {"type": "CONFIG_NOT_PROVIDED"}
  {"type": "CONFIG_ALREADY_RECEIVED"}
In addition, a reason will be supplied, e.g. reason: language unavailable.
Once the configuration has been accepted and the session is running, you may encounter runtime or application-level errors. These are sent as JSON objects with the following structure:
{
  "type": "error",
  "error": {
    "id": "error id",
    "title": "error title",
    "status": 400,
    "details": "error details",
    "doc":"link to documentation"
  }
}
In some cases, receiving an “error” type message will cause the stream to end and send a message of type usage and type ENDED.
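The distinction between configuration errors and runtime errors can be sketched as a small classifier (classifyMessage is a hypothetical helper; the message types are those listed above):

```typescript
// Configuration-phase error types, per the list above.
const CONFIG_ERRORS = new Set([
  "CONFIG_DENIED",
  "CONFIG_MISSING",
  "CONFIG_NOT_PROVIDED",
  "CONFIG_ALREADY_RECEIVED",
  "CONFIG_TIMEOUT",
]);

// Sort an incoming message into config errors, runtime errors, or normal traffic.
function classifyMessage(msg: { type: string }): "config-error" | "runtime-error" | "ok" {
  if (CONFIG_ERRORS.has(msg.type)) return "config-error";
  if (msg.type === "error") return "runtime-error";
  return "ok";
}
```

A client might close the socket on config errors (to avoid unwanted reconnects) while logging runtime errors and watching for the trailing usage and ENDED messages.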

Using SDK

You can use the Corti SDK (currently in “beta”) to handle error messages.
With recommended configuration, configuration errors (e.g., CONFIG_DENIED, CONFIG_MISSING, etc.) and runtime errors will both trigger the error event and automatically close the socket. You can also inspect the original message in the message handler. With manual configuration, configuration errors are only received as messages (not as error events), and you must close the socket manually to avoid reconnection.
const streamSocket = await cortiClient.stream.connect({
  id: "<interactionId>",
  configuration
});

streamSocket.on("error", (error) => {
  // Emitted for both configuration and runtime errors
  console.error("Error event:", error);
  // The socket will close itself automatically
});

// still can be accessed with normal "message" subscription
streamSocket.on("message", (message) => {
  if (
    message.type === "CONFIG_DENIED" ||
    message.type === "CONFIG_MISSING" ||
    message.type === "CONFIG_NOT_PROVIDED" ||
    message.type === "CONFIG_ALREADY_RECEIVED" ||
    message.type === "CONFIG_TIMEOUT"
  ) {
    console.log("Configuration error (message):", message);
  }

  if (message.type === "error") {
    console.log("Runtime error (message):", message);
  }
});