Real-time conversational transcript generation and fact extraction (FactsR™)

Overview

The WebSocket Secure (WSS) /stream API enables real-time, bidirectional communication with the Corti system for interaction streaming. Clients can send and receive structured data, including transcripts and fact updates. Learn more about FactsR™ here. This page describes how to integrate the Corti WSS API for real-time interaction streaming.

This /stream endpoint supports real-time ambient documentation interactions and clinical decision support workflows.

If you are looking for a stateless endpoint geared towards front-end dictation workflows, use the /transcribe WSS endpoint.
If you are looking for asynchronous ambient documentation interactions, refer to the /documents endpoint.

Environment Options

Environment	Description
`us`	US-based instance
`eu`	EU-based instance

Establishing a Connection

Clients must initiate a WebSocket connection using the wss:// scheme and provide a valid interaction ID in the URL.

When you create an interaction, the 200 response provides a websocketUrl for that interaction, which includes the tenant-name as a URL parameter. To authenticate the WSS stream, add a token parameter carrying the Bearer access token alongside the tenant-name parameter.
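As a minimal sketch of the step above, the helper below appends the token parameter to the websocketUrl returned when the interaction was created. The function name buildStreamUrl and the example host and path are hypothetical placeholders; the "Bearer " prefix follows the token parameter description in the Request section.

```typescript
// Hypothetical helper: adds the access token to the websocketUrl returned
// by the interaction-creation response (which already carries tenant-name).
function buildStreamUrl(websocketUrl: string, accessToken: string): string {
  const url = new URL(websocketUrl);
  if (url.protocol !== "wss:") {
    throw new Error(`Expected a wss:// URL, got ${url.protocol}`);
  }
  if (!url.searchParams.has("tenant-name")) {
    throw new Error("websocketUrl is missing the tenant-name parameter");
  }
  // The token parameter carries the Bearer access token.
  url.searchParams.set("token", `Bearer ${accessToken}`);
  return url.toString();
}

// Example with a placeholder host and path (not a real Corti URL).
const exampleStreamUrl = buildStreamUrl(
  "wss://example.invalid/placeholder-path?tenant-name=YOUR_TENANT_NAME",
  "YOUR_ACCESS_TOKEN",
);
console.log(exampleStreamUrl);
```

Validating the scheme and the tenant-name parameter up front surfaces a malformed URL before the connection attempt rather than as a failed handshake.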

Using SDK

You can use the Corti SDK (currently in “alpha”, not for production use) to connect to a stream endpoint.

import { CortiClient, CortiEnvironment } from "@corti/sdk";

const cortiClient = new CortiClient({
    tenantName: "YOUR_TENANT_NAME",
    environment: CortiEnvironment.BetaEu,
    auth: {
        accessToken: "YOUR_ACCESS_TOKEN"
    },
});

const streamSocket = await cortiClient.stream.connect({
  id: "<interactionId>"
});

Request

Path Parameters

uuid

required

Unique interaction identifier

Query Parameters

Parameter	Type	Required	Description
`tenant-name`	string	Yes	Specifies the tenant context
`token`	string	Yes	Bearer $token

Responses

101 Switching Protocols

Indicates a successful WebSocket connection. Once connected, the server streams data in the following formats.

By default, both the returned data streams and the incoming audio are saved. Using the interactionId, you can retrieve the saved transcripts, facts, and recording(s) from the relevant REST endpoints. Audio recordings are saved in .webm format. Corti can turn this storage off so that you can comply with your applicable regulations and data handling preferences.

Data Streams

Transcript Stream

Property	Type	Description
`type`	string	"transcript"
`data`	array of objects	Transcript segments
`data[].id`	string	Unique identifier for the transcript
`data[].transcript`	string	The transcribed text
`data[].final`	boolean	Indicates whether the transcript is finalized or interim
`data[].speakerId`	integer	Speaker identifier (-1 if diarization is off)
`data[].participant.channel`	integer	Audio channel number (e.g. 0 or 1)
`data[].time.start`	number	Start time of the transcript segment
`data[].time.end`	number	End time of the transcript segment

{
  "type": "transcript",
  "data": [
    {
      "id": "UUID",
      "transcript": "Patient presents with fever and cough.",
      "final": true,
      "speakerId": -1,
      "participant": { "channel": 0 },
      "time": { "start": 1.71, "end": 11.296 }
    }
  ]
}
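The sketch below shows one way a client might consume transcript messages: it types the message according to the property table above and keeps the latest version of each segment. The names applyTranscript and finalText are hypothetical, and the code assumes that an interim segment and its later update share the same id, which the text above does not confirm.

```typescript
// Shape of a transcript message, following the property table above.
interface TranscriptSegment {
  id: string;
  transcript: string;
  final: boolean;
  speakerId: number;
  participant: { channel: number };
  time: { start: number; end: number };
}

interface TranscriptMessage {
  type: "transcript";
  data: TranscriptSegment[];
}

// Keeps the latest version of each segment, keyed by id.
// Assumption: an interim segment and its later update share the same id.
function applyTranscript(
  segments: Map<string, TranscriptSegment>,
  message: TranscriptMessage,
): void {
  for (const segment of message.data) {
    segments.set(segment.id, segment);
  }
}

// Joins the finalized segments in start-time order.
function finalText(segments: Map<string, TranscriptSegment>): string {
  return Array.from(segments.values())
    .filter((segment) => segment.final)
    .sort((a, b) => a.time.start - b.time.start)
    .map((segment) => segment.transcript)
    .join(" ");
}
```

Keeping interim segments in the same map lets a user interface show provisional text immediately, while finalText only ever returns finalized content.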

Fact Stream

Property	Type	Description
`type`	string	"facts"
`fact`	array of objects	Fact objects
`fact[].id`	string	Unique identifier for the fact
`fact[].text`	string	Text description of the fact
`fact[].group`	string	Categorization of the fact (e.g., “medical-history”)
`fact[].groupId`	string	Unique identifier for the group
`fact[].isDiscarded`	boolean	Indicates if the fact was discarded
`fact[].source`	string	Source of the fact (e.g., “core”)
`fact[].createdAt`	string (date-time)	Timestamp when the fact was created
`fact[].updatedAt`	string or null (date-time)	Timestamp when the fact was last updated

{
  "type": "facts",
  "fact": [
    {
      "id": "UUID",
      "text": "Patient has a history of hypertension.",
      "group": "medical-history",
      "groupId": "UUID",
      "isDiscarded": false,
      "source": "core",
      "createdAt": "2024-02-28T12:34:56Z",
      "updatedAt": "2024-02-28T12:35:56Z"
    }
  ]
}
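As an illustration of how fact updates could be applied on the client, the sketch below maintains a local store keyed by fact id. The names applyFacts and factsByGroup are hypothetical, and the code assumes that an updated or discarded fact reuses the id of the original fact.

```typescript
// Shape of a facts message, following the property table above.
interface Fact {
  id: string;
  text: string;
  group: string;
  groupId: string;
  isDiscarded: boolean;
  source: string;
  createdAt: string;
  updatedAt: string | null;
}

interface FactsMessage {
  type: "facts";
  fact: Fact[];
}

// Upserts facts by id and drops those flagged as discarded.
// Assumption: an updated or discarded fact reuses the id of the original.
function applyFacts(store: Map<string, Fact>, message: FactsMessage): void {
  for (const fact of message.fact) {
    if (fact.isDiscarded) {
      store.delete(fact.id);
    } else {
      store.set(fact.id, fact);
    }
  }
}

// Groups the current fact texts by their group name.
function factsByGroup(store: Map<string, Fact>): Record<string, string[]> {
  const grouped: Record<string, string[]> = {};
  store.forEach((fact) => {
    if (!grouped[fact.group]) {
      grouped[fact.group] = [];
    }
    grouped[fact.group].push(fact.text);
  });
  return grouped;
}
```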

Using SDK

You can use the Corti SDK (currently in “alpha”, not for production use) to subscribe to stream messages.

streamSocket.on("message", (message) => {
  // Distinguish message types
  switch (message.type) {
    case "transcript":
      // Handle transcript message
      console.log("Transcript:", message);
      break;
    case "facts":
      // Handle facts message
      console.log("Facts:", message);
      break;
    case "error":
      // Handle error message
      console.error("Error:", message);
      break;
    default:
      // Handle other message types
      console.log("Other message:", message);
  }
});

streamSocket.on("error", (error) => {
  // Handle error
  console.error(error);
});

streamSocket.on("close", () => {
  // Handle socket close
  console.log("Stream closed");
});

Sending Messages

Clients must send a stream configuration message and wait for a response of type CONFIG_ACCEPTED before transmitting other data.

Stream Configuration

Property	Type	Required	Description
`type`	string	Yes	"config"
`configuration`	object	Yes	Configuration settings
`configuration.transcription.primaryLanguage`	string (enum)	Yes	Primary spoken language for transcription
`configuration.transcription.isDiarization`	boolean	No (default: `false`)	Enable speaker diarization
`configuration.transcription.isMultichannel`	boolean	No (default: `false`)	Enable multi-channel audio processing
`configuration.transcription.participants`	array	Yes	List of participants with roles assigned to a channel
`configuration.transcription.participants[].channel`	integer	Yes	Audio channel number (e.g. 0 or 1)
`configuration.transcription.participants[].role`	string (enum)	Yes	"doctor", "patient", or "multiple"
`configuration.mode.type`	string (enum)	Yes	"facts" or "transcription"
`configuration.mode.outputLocale`	string (enum)	No	Output language locale (required when `configuration.mode.type` is "facts")

Example Configuration

{
  "type": "config",
  "configuration": {
    "transcription": {
      "primaryLanguage": "en",
      "isDiarization": false,
      "isMultichannel": false,
      "participants": [
        {
          "channel": 0,
          "role": "multiple"
        }
      ]
    },
    "mode": {
      "type": "facts",
      "outputLocale": "en"
    }
  }
}

Once the server responds with the following message, clients can proceed with sending audio or controlling the stream status:

{
  "type": "CONFIG_ACCEPTED"
}
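The handshake described above can be sketched without the SDK as two small pure functions: one builds the configuration frame, the other tracks whether audio may be sent yet. The names buildConfigFrame and nextHandshakeState are hypothetical, and treating every other CONFIG_* reply as a rejection is an assumption based on the Error Handling section.

```typescript
type ParticipantRole = "doctor" | "patient" | "multiple";

interface StreamConfigMessage {
  type: "config";
  configuration: {
    transcription: {
      primaryLanguage: string;
      isDiarization?: boolean;
      isMultichannel?: boolean;
      participants: { channel: number; role: ParticipantRole }[];
    };
    mode: { type: "facts" | "transcription"; outputLocale?: string };
  };
}

// Serializes a single-channel "facts" configuration as a text frame.
function buildConfigFrame(primaryLanguage: string, outputLocale: string): string {
  const message: StreamConfigMessage = {
    type: "config",
    configuration: {
      transcription: {
        primaryLanguage,
        participants: [{ channel: 0, role: "multiple" }],
      },
      mode: { type: "facts", outputLocale },
    },
  };
  return JSON.stringify(message);
}

type HandshakeState = "awaiting-config-reply" | "ready" | "rejected";

// Advances the handshake for each text frame received from the server.
function nextHandshakeState(state: HandshakeState, rawFrame: string): HandshakeState {
  if (state !== "awaiting-config-reply") {
    return state;
  }
  const message = JSON.parse(rawFrame) as { type?: string };
  if (message.type === "CONFIG_ACCEPTED") {
    return "ready";
  }
  // Assumption: any other CONFIG_* reply means the configuration was not accepted.
  if (typeof message.type === "string" && message.type.startsWith("CONFIG_")) {
    return "rejected";
  }
  return state;
}
```

A client would send the result of buildConfigFrame as its first message and only start transmitting audio once the state reaches "ready".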

Using SDK

You can use the Corti SDK (currently in “alpha”, not for production use) to send stream configuration.

You can provide the configuration either directly when connecting, or send it as a separate message after establishing the connection:

const configuration = {
  transcription: {
    primaryLanguage: "en",
    isDiarization: false,
    isMultichannel: false,
    participants: [
      {
        channel: 0,
        role: "multiple"
      }
    ]
  },
  mode: {
    type: "facts",
    outputLocale: "en"
  }
};

const streamSocket = await cortiClient.stream.connect({
  id: "<interactionId>",
  configuration
});

Controlling Stream Status

To end the stream, send:

{
  "type": "end"
}

The connection remains open until all transcripts are complete. The server then sends a message of type ENDED and closes the connection.

Using SDK

You can use the Corti SDK (currently in “alpha”, not for production use) to control the stream status.

When using automatic configuration (passing configuration to connect), the socket will close itself without reconnecting when it receives an ENDED message. When using manual configuration, the socket will attempt to reconnect after the server closes the connection. To prevent this, you must subscribe to the ENDED message and manually close the connection.

const streamSocket = await cortiClient.stream.connect({
  id: "<interactionId>",
  configuration
});

streamSocket.sendEnd({ type: "end" });

Sending Audio Data

Before sending audio, make sure that your configuration was accepted and that your initial audio chunk is not too small, as it must contain the headers needed to decode the audio. We recommend sending audio in chunks of 500ms. For buffering, the limit is 64000 bytes per chunk. Send audio data as raw binary without JSON wrapping. For bandwidth and efficiency reasons we recommend the webm/opus encoding, but you can send a variety of common audio formats because the audio first passes through a transcoder. For the same reason, you do not need to specify a sample rate, bit depth, or other audio settings.
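To illustrate the 64000-byte limit, the sketch below splits a binary buffer into consecutive pieces that each stay within it. The helper name splitIntoChunks is hypothetical; it acts as a size guard only and does not align chunks to 500ms of audio, which depends on how the audio is recorded.

```typescript
// Buffering limit per chunk, taken from the text above.
const MAX_CHUNK_BYTES = 64000;

// Splits a buffer into consecutive pieces of at most maxBytes each.
function splitIntoChunks(
  audio: Uint8Array,
  maxBytes: number = MAX_CHUNK_BYTES,
): Uint8Array[] {
  if (maxBytes <= 0) {
    throw new Error("maxBytes must be positive");
  }
  const chunks: Uint8Array[] = [];
  for (let offset = 0; offset < audio.byteLength; offset += maxBytes) {
    // slice copies the bytes, so each chunk owns its own ArrayBuffer.
    chunks.push(audio.slice(offset, offset + maxBytes));
  }
  return chunks;
}

const audioPieces = splitIntoChunks(new Uint8Array(150000));
console.log(audioPieces.map((piece) => piece.byteLength)); // [ 64000, 64000, 22000 ]
```

The pieces are returned in order, so the first one still contains the container headers that the initial chunk must carry.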

Channels, participants and speakers

In a typical on-site setting you will be sending mono-channel audio. If the microphone is a stereo microphone, set isMultichannel: false so that the audio is converted to mono-channel and no duplicate transcripts are returned.

In a virtual setting such as telehealth, you would typically have the remote audio from WebRTC on one channel and mix in the microphone of the local client on a separate channel. In this scenario, set isMultichannel: true and assign each channel the relevant participant role. For example, if the doctor is on the local client and on channel 0, set the role for channel 0 to doctor.

Diarization is independent of audio channels and participant roles. If you want transcript segments to be assigned to automatically identified speakers, set isDiarization: true; diarization then tries to identify speakers separately on each channel. The first identified speaker on each channel has transcript segments with speakerId: 0, the second speakerId: 1, and so forth. If isDiarization is false, transcript segments are returned with speakerId: -1.

SpeakerIds are not related or matched to participant roles.
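As a sketch of the telehealth scenario above, the configuration below puts the local doctor microphone on channel 0 and the remote patient audio on channel 1. Which side goes on which channel is a choice made here for illustration, and labelSegment is a hypothetical helper that derives a display label from the channel role while keeping speakerId separate.

```typescript
// Telehealth sketch: doctor on channel 0, remote patient audio on channel 1.
const telehealthConfiguration = {
  transcription: {
    primaryLanguage: "en",
    isDiarization: true,
    isMultichannel: true,
    participants: [
      { channel: 0, role: "doctor" },
      { channel: 1, role: "patient" },
    ],
  },
  mode: { type: "facts", outputLocale: "en" },
};

// Labels a transcript segment by the role configured for its channel.
// speakerId is appended separately because speaker ids are not matched to roles.
function labelSegment(channel: number, speakerId: number): string {
  const participant = telehealthConfiguration.transcription.participants.find(
    (candidate) => candidate.channel === channel,
  );
  const role = participant ? participant.role : "unknown";
  return speakerId >= 0 ? `${role} (speaker ${speakerId})` : role;
}

console.log(labelSegment(1, 0)); // "patient (speaker 0)"
```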

Using SDK

You can use the Corti SDK (currently in “alpha”, not for production use) to send audio data to the stream.

To send audio, use the sendAudio method on the stream socket. Audio should be sent as binary chunks (e.g., ArrayBuffer):

streamSocket.sendAudio(chunk); // sendAudio does not split the data; chunk it before calling

Error Handling

In case of an invalid or missing interaction ID, the server will return an error before opening the WebSocket.

After the WebSocket opens, you must commit the configuration within 15 seconds; otherwise the WebSocket closes again.

At the beginning of a WebSocket session the following messages related to configuration can be returned.


  {"type": "CONFIG_DENIED"} // in case the configuration is not valid
  {"type": "CONFIG_MISSING"}
  {"type": "CONFIG_NOT_PROVIDED"}
  {"type": "CONFIG_ALREADY_RECEIVED"}

In addition, a reason will be supplied, e.g. reason: language unavailable.

Once configuration has been accepted and the session is running, you may encounter runtime or application-level errors. These are sent as JSON objects with the following structure:

{
  "type": "error",
  "error": {
    "id": "error id",
    "title": "error title",
    "status": 400,
    "details": "error details",
    "doc":"link to documentation"
  }
}

In some cases, an "error" message causes the stream to end; the server then sends a message of type usage followed by a message of type ENDED.
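For clients that read raw frames, a runtime error can be recognized with a small parser like the sketch below. The names parseStreamError and describeError are hypothetical; the field names follow the JSON structure shown above.

```typescript
// Runtime error message, following the JSON structure above.
interface StreamErrorMessage {
  type: "error";
  error: {
    id: string;
    title: string;
    status: number;
    details: string;
    doc: string;
  };
}

// Returns the parsed error message, or null if the frame is something else.
function parseStreamError(rawFrame: string): StreamErrorMessage | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawFrame);
  } catch {
    return null; // not JSON, so not an error message
  }
  if (typeof parsed !== "object" || parsed === null) {
    return null;
  }
  const candidate = parsed as Partial<StreamErrorMessage>;
  if (candidate.type !== "error" || typeof candidate.error !== "object" || candidate.error === null) {
    return null;
  }
  return candidate as StreamErrorMessage;
}

// Formats an error for logging.
function describeError(message: StreamErrorMessage): string {
  return `[${message.error.status}] ${message.error.title}: ${message.error.details}`;
}
```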

Using SDK

You can use the Corti SDK (currently in “alpha”, not for production use) to handle error messages.

With the recommended (automatic) configuration, where the configuration is passed to connect, configuration errors (e.g., CONFIG_DENIED, CONFIG_MISSING, etc.) and runtime errors both trigger the error event and automatically close the socket. You can also inspect the original message in the message handler. With manual configuration, configuration errors are only received as messages (not as error events), and you must close the socket manually to avoid reconnection.

const streamSocket = await cortiClient.stream.connect({
  id: "<interactionId>",
  configuration
});

streamSocket.on("error", (error) => {
  // Emitted for both configuration and runtime errors
  console.error("Error event:", error);
  // The socket will close itself automatically
});

// The same errors can still be read through the regular "message" subscription
streamSocket.on("message", (message) => {
  if (
    message.type === "CONFIG_DENIED" ||
    message.type === "CONFIG_MISSING" ||
    message.type === "CONFIG_NOT_PROVIDED" ||
    message.type === "CONFIG_ALREADY_RECEIVED" ||
    message.type === "CONFIG_TIMEOUT"
  ) {
    console.log("Configuration error (message):", message);
  }

  if (message.type === "error") {
    console.log("Runtime error (message):", message);
  }
});

Closing the Connection

To terminate the WebSocket session, send a standard WebSocket close frame, or use:

{
  "type": "end"
}

The connection remains open until all transcripts are complete, at which point the server sends a usage message:

{
  "type":"usage",
  "credits":0.1
}

The server then sends a message of type ENDED and closes the connection.
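The closing sequence above (usage, then ENDED) can be tracked with a small reducer, sketched below for clients that process raw frames. The names ShutdownState and onShutdownFrame are hypothetical.

```typescript
// What the client has learned so far while the stream is shutting down.
interface ShutdownState {
  credits: number | null; // credits reported by the usage message, if received
  ended: boolean; // true once the ENDED message has arrived
}

// Folds one received text frame into the shutdown state.
function onShutdownFrame(state: ShutdownState, rawFrame: string): ShutdownState {
  const message = JSON.parse(rawFrame) as { type?: string; credits?: number };
  if (message.type === "usage" && typeof message.credits === "number") {
    return { ...state, credits: message.credits };
  }
  if (message.type === "ENDED") {
    return { ...state, ended: true };
  }
  // Transcript or fact frames may still arrive before the stream ends.
  return state;
}

let shutdownState: ShutdownState = { credits: null, ended: false };
shutdownState = onShutdownFrame(shutdownState, '{"type":"usage","credits":0.1}');
shutdownState = onShutdownFrame(shutdownState, '{"type":"ENDED"}');
console.log(shutdownState); // { credits: 0.1, ended: true }
```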

You can reopen the WebSocket at any time and resume by sending the configuration again.

Using SDK

You can use the Corti SDK (currently in “alpha”, not for production use) to control the stream status.

const streamSocket = await cortiClient.stream.connect({
  id: "<interactionId>",
  configuration
});

// Subscribe before sending "end" so the usage and ENDED messages are not missed
streamSocket.on("message", (message) => {
  if (message.type === "usage") {
    console.log("Usage:", message);
  }

  // The ENDED message is received, then the connection closes automatically
  if (message.type === "ENDED") {
    console.log("ENDED:", message);
  }
});

streamSocket.sendEnd({ type: "end" });
