Both the /transcribe and /streams endpoints use WebSocket Secure (WSS) connections to support real-time, bidirectional audio streaming and transcript delivery. This guide covers best practices for establishing, managing, and recovering WSS connections reliably in production integrations.
Where behaviour differs between the two endpoints, the differences are called out explicitly.

Establishing a Connection

  • Always use the wss:// scheme. Plain ws:// is not supported.
  • Include both tenant-name and token (Bearer access token) as query parameters on the connection URL. Both are required for authentication.
  • For /streams, the interaction id must be included in the URL path. Create the interaction first via POST /interactions/ and use the websocketUrl returned in the response.
  • A 101 Switching Protocols response confirms a successful connection.
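
As a minimal sketch of the URL rules above, the helper below adds the two required query parameters to a wss:// URL and rejects any other scheme. The function name build_ws_url and the host api.example.invalid are hypothetical. For /streams, the base URL would be the websocketUrl returned by POST /interactions/. The token is passed through unchanged, since the exact token format expected is not specified here.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_ws_url(base_url: str, tenant_name: str, token: str) -> str:
    """Return base_url with tenant-name and token set as query parameters."""
    parts = urlsplit(base_url)
    if parts.scheme != "wss":
        # Plain ws:// is not supported, so fail before attempting to connect.
        raise ValueError("only wss:// URLs are supported")
    # Keep any query parameters already on the URL, then add the required two.
    query = dict(parse_qsl(parts.query))
    query["tenant-name"] = tenant_name
    query["token"] = token
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical host and credentials, for illustration only.
url = build_ws_url("wss://api.example.invalid/transcribe", "acme", "example-token")
```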

Configuration Sequencing

Send your configuration immediately after the connection is established. Do not send audio before configuring.
  • The type: "config" message must be sent within 10 seconds of opening the connection. If it is not, the connection times out with an endpoint-specific message:
    • /transcribe times out with CONFIG_TIMEOUT
    • /streams times out with CONFIG_NOT_PROVIDED
  • Wait for CONFIG_ACCEPTED before transmitting any audio. Sending audio before the acknowledgement may result in data loss or unexpected behaviour.
  • If configuration is invalid, the server returns CONFIG_DENIED. Inspect the response for the specific reason and correct the configuration before reconnecting.
Specifying audioFormat in your configuration is strongly recommended. When it is omitted, the server attempts to auto-detect the format from the first audio chunk, and this detection can fail silently on unsupported formats. On /streams, an unsupported audioFormat MIME type returns CONFIG_REJECTED.
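
The sequencing rules above can be sketched as a small state tracker that refuses to release audio until CONFIG_ACCEPTED has arrived. This is an illustration, not part of any SDK: the class name ConfigGate and its state names are hypothetical, and it assumes the caller has already extracted the message type string from each server message.

```python
CONFIG_DEADLINE_SECONDS = 10.0  # from the guide: config is due within 10 seconds

# Configuration responses that end the attempt; the client must reconnect.
FAILURE_TYPES = {
    "CONFIG_DENIED",
    "CONFIG_TIMEOUT",
    "CONFIG_NOT_PROVIDED",
    "CONFIG_REJECTED",
}


class ConfigGate:
    """Tracks whether audio may be sent on a single connection."""

    def __init__(self, opened_at: float) -> None:
        self.opened_at = opened_at  # timestamp (seconds) when the socket opened
        self.state = "awaiting_config"

    def seconds_left_to_send_config(self, now: float) -> float:
        """Time remaining before the server-side config deadline."""
        return max(0.0, CONFIG_DEADLINE_SECONDS - (now - self.opened_at))

    def on_config_sent(self) -> None:
        self.state = "awaiting_ack"

    def on_server_message(self, message_type: str) -> None:
        if message_type == "CONFIG_ACCEPTED":
            self.state = "streaming"
        elif message_type in FAILURE_TYPES:
            self.state = "failed"
        # Other message types (for example transcripts) leave the state as is.

    @property
    def can_send_audio(self) -> bool:
        return self.state == "streaming"
```

A client would check can_send_audio before every audio send, and treat the failed state as a signal to fix the configuration (if needed) and reconnect.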

Audio Chunk Size

Chunk size directly affects transcript latency and connection stability.
  • Send small audio chunks during normal operation. Chunks of 250ms are recommended. Smaller chunks do not reduce transcript latency and only add network traffic.
  • The maximum WebSocket message size is 1MB. Messages exceeding this limit may be rejected by the server or lead to transcript errors.
  • All WebSocket frames share the same underlying TCP connection. Sending large messages near the 1MB limit can briefly block other frames. Keep chunk sizes consistent during active streaming.
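
To make the 250ms recommendation concrete, the sketch below computes how many bytes a chunk of that duration occupies. It assumes uncompressed PCM audio, which is an assumption made for illustration; for compressed formats the byte size per chunk depends on the codec. The function name pcm_chunk_bytes is hypothetical.

```python
def pcm_chunk_bytes(
    sample_rate_hz: int,
    bytes_per_sample: int,
    channels: int,
    chunk_ms: int = 250,
) -> int:
    """Bytes of raw PCM audio in one chunk of chunk_ms milliseconds."""
    frame_bytes = bytes_per_sample * channels  # one sample for every channel
    size = sample_rate_hz * frame_bytes * chunk_ms // 1000
    # Round down to a whole number of frames so no sample is split across chunks.
    return size - (size % frame_bytes)


# Example: 16 kHz, 16-bit, mono audio (assumed values).
chunk_size = pcm_chunk_bytes(sample_rate_hz=16_000, bytes_per_sample=2, channels=1)
```

Under these assumed values a 250ms chunk is 8,000 bytes, far below the 1MB message limit.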

Handling buffered audio

If audio accumulates in a client-side buffer (e.g., due to a transient network dropout), do not flush the entire buffer as a single message. Instead:
  1. Break the buffered audio into sequential chunks of no more than 1MB each.
  2. Send them in order after CONFIG_ACCEPTED is received.
  3. Resume normal real-time streaming once the buffer is drained.
Sending a single message larger than 1MB can cause the server to reject the message or drop transcript segments. Always chunk buffered audio before sending.
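
The chunking step above can be sketched as a generator that yields the buffer in order, in pieces that never exceed the limit. The name split_buffer is hypothetical, and the limit is read conservatively as 1,000,000 bytes because the guide does not say whether 1MB means 10^6 or 2^20 bytes.

```python
from typing import Iterator

# Assumption: "1MB" is taken as 10**6 bytes, the smaller of the two common readings.
MAX_MESSAGE_BYTES = 1_000_000


def split_buffer(buffered: bytes, max_bytes: int = MAX_MESSAGE_BYTES) -> Iterator[bytes]:
    """Yield buffered audio in order, in pieces of at most max_bytes."""
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    view = memoryview(buffered)  # slice without copying the whole buffer each time
    for start in range(0, len(view), max_bytes):
        yield bytes(view[start:start + max_bytes])
```

Each yielded piece would be sent as its own WebSocket message once CONFIG_ACCEPTED has been received. For raw PCM audio, choosing a max_bytes that is a multiple of the frame size avoids splitting a sample across two messages.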

Handling Disconnections and Reconnecting

WebSocket connections are not automatically restored after a network dropout. Clients are responsible for detecting disconnection and reconnecting.

/transcribe

  • On disconnect, open a new WebSocket connection and re-send the full type: "config" message.
  • There is no persistent session state on the server. Each new connection starts a fresh session.
  • After receiving CONFIG_ACCEPTED, send any audio that accumulated client-side during the dropout in chunks of no more than 1MB before resuming real-time streaming.

/streams

  • On disconnect, reconnect using the same interaction URL (same interaction id).
  • Each reconnection creates a new recording on the server. All recordings are stored and associated with the same interaction, so a single interaction may have multiple recordings.
  • After receiving CONFIG_ACCEPTED on the new connection, send any buffered audio in chunks of no more than 1MB before resuming real-time streaming.

General reconnection guidance

  • Implement exponential backoff for reconnection attempts, starting with an initial delay of 200 milliseconds and adding 0-100 milliseconds of random jitter to each subsequent delay. Avoid immediate retry loops after a dropout.
  • Always wait for CONFIG_ACCEPTED before sending any audio, including buffered audio accumulated before the disconnect.
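
One possible reading of the backoff guidance is sketched below: the first delay is 200 milliseconds, each later delay doubles, and 0-100 milliseconds of random jitter is added from the second attempt onward. The doubling factor, the 10-second cap, and the attempt count are assumptions made for illustration; the guide only specifies the initial delay and the jitter range. The name backoff_delays is hypothetical.

```python
import random
from typing import Callable, Iterator


def backoff_delays(
    initial_s: float = 0.2,      # from the guide: 200 ms initial delay
    max_jitter_s: float = 0.1,   # from the guide: 0-100 ms of jitter
    factor: float = 2.0,         # assumption: double the delay each attempt
    max_delay_s: float = 10.0,   # assumption: cap on the base delay
    attempts: int = 6,           # assumption: how many retries to schedule
    rng: Callable[[], float] = random.random,
) -> Iterator[float]:
    """Yield the number of seconds to wait before each reconnection attempt."""
    delay = initial_s
    for attempt in range(attempts):
        # Jitter is applied to every delay after the first one.
        jitter = rng() * max_jitter_s if attempt > 0 else 0.0
        yield min(delay, max_delay_s) + jitter
        delay *= factor
```

A reconnect loop would sleep for each yielded delay before opening a new connection, stop iterating once CONFIG_ACCEPTED is received, and start a fresh sequence after the next dropout.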

Session Teardown

  • For /transcribe: send type: "end" to signal that the session is complete. The server will finalize outstanding audio processing and close the connection cleanly.
  • For /streams: send type: "end" to signal that the interaction audio is complete and trigger final processing.
  • Do not close the WebSocket abruptly. Ungraceful disconnection may result in incomplete transcripts or lost final results.
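
A graceful teardown might look like the sketch below, which sends type: "end" and then waits for the server to close the connection instead of closing it from the client side. The Connection interface is hypothetical and stands in for whatever WebSocket client is used. The 10-second wait is an assumption, and the sketch assumes that a separate reader keeps consuming incoming messages so that final transcripts are not missed.

```python
import asyncio
import json
from typing import Protocol


class Connection(Protocol):
    """Hypothetical minimal interface of a WebSocket client connection."""

    async def send(self, data: str) -> None: ...
    async def wait_closed(self) -> None: ...
    async def close(self) -> None: ...


async def end_session(conn: Connection, timeout_s: float = 10.0) -> bool:
    """Send type "end" and wait for the server to close the connection.

    Returns True if the server closed within timeout_s (an assumed value),
    False if the client had to close the connection itself.
    """
    await conn.send(json.dumps({"type": "end"}))
    try:
        await asyncio.wait_for(conn.wait_closed(), timeout=timeout_s)
        return True
    except asyncio.TimeoutError:
        # Fallback only: the server did not close in time.
        await conn.close()
        return False
```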

Error Handling Reference

Always implement listeners for all configuration response message types. Unhandled configuration errors leave the connection in an unusable state.
Message             | Endpoint    | Meaning                                     | Action
--------------------|-------------|---------------------------------------------|-----------------------------------------
CONFIG_ACCEPTED     | Both        | Configuration valid, session active         | Begin streaming audio
CONFIG_DENIED       | Both        | Configuration invalid                       | Inspect reason, correct config, reconnect
CONFIG_TIMEOUT      | /transcribe | Config not sent within 10s                  | Reconnect and send config immediately
CONFIG_NOT_PROVIDED | /streams    | Config not sent within 10s                  | Reconnect and send config immediately
CONFIG_REJECTED     | /streams    | Unsupported audioFormat MIME type provided  | Correct the MIME type and reconnect

Please contact us for more information or help.