The SDK wraps both WebSocket APIs — Stream and Transcribe — with a unified interface: a promise-based connect(), automatic reconnection, and typed message events.

Connecting

Both WebSocket APIs require a handshake before audio can flow. After the connection opens, the client must send a configuration message and wait for the server to respond with CONFIG_ACCEPTED. Only then is it safe to start streaming audio. If configuration is rejected (CONFIG_DENIED, CONFIG_TIMEOUT), the session cannot be used. By default, the SDK handles this handshake for you. Pass configuration to connect() and the promise resolves only after CONFIG_ACCEPTED is received — or rejects (and closes the socket) if the configuration is refused.
Because connect() awaits the handshake before returning, your socket.on("message", ...) handler is registered after the config exchange completes. CONFIG_* messages are never visible in the message handler — they are consumed internally by the SDK.
const socket = await client.stream.connect({
    id: interactionId,
    configuration: {
        transcription: {
            primaryLanguage: "en",
            participants: [{ channel: 0, role: "doctor" }],
        },
        mode: { type: "facts", outputLocale: "en" },
    },
});

socket.on("message", (msg) => {
    console.log(msg.type, msg.data);
});

socket.sendAudio(audioBuffer);
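Because connect() rejects when the server refuses the configuration, a try/catch around the call is the natural error boundary. A minimal sketch of the pattern; `connectFn` is a hypothetical stand-in for `client.stream.connect` so the logic can be shown in isolation:

```javascript
// Sketch: connect() rejects (and closes the socket) if the configuration is
// refused. `connectFn` is a hypothetical stand-in for client.stream.connect:
// any function that returns a promise resolving to a socket.
async function connectOrNull(connectFn, options) {
  try {
    return await connectFn(options);
  } catch (err) {
    // CONFIG_DENIED, CONFIG_TIMEOUT, or a transport failure
    console.error("Failed to establish session:", err.message ?? err);
    return null;
  }
}
```

If the helper returns null, start a fresh session rather than retrying on the same socket: a refused configuration leaves the session unusable.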

Getting the socket before the handshake completes

By default connect() waits for CONFIG_ACCEPTED before resolving. Set awaitConfiguration: false to have the socket returned immediately, before the WebSocket even opens. This lets you attach event handlers right away and guarantees you won’t miss any messages, including early config status events. You are then responsible for waiting for CONFIG_ACCEPTED before sending audio. If configuration is rejected, error events are emitted on the socket instead of the promise rejecting.
awaitConfiguration: false was the default behavior prior to v1.0.0. If you are migrating from an older version, set it explicitly to preserve the previous behavior.
const socket = await client.stream.connect({
    id: interactionId,
    configuration: {
        transcription: {
            primaryLanguage: "en",
            participants: [{ channel: 0, role: "doctor" }],
        },
        mode: { type: "facts", outputLocale: "en" },
    },
    awaitConfiguration: false, // socket returned immediately, before the WebSocket opens
});

// Must wait for CONFIG_ACCEPTED before sending audio
socket.on("message", (msg) => {
    if (msg.type === "CONFIG_ACCEPTED") {
        console.log("Configuration accepted — ready to send audio");
        socket.sendAudio(audioBuffer);
    }
    if (msg.type === "CONFIG_DENIED") {
        console.error("Configuration denied:", msg.reason);
        socket.close();
    }
});
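If you prefer await-style code over the handler above, you can promisify the wait yourself. A sketch under the assumption that the socket exposes on("message", ...) as in the examples; waitForConfigAccepted is a hypothetical helper, not an SDK method:

```javascript
// Hypothetical helper (not part of the SDK): resolves on CONFIG_ACCEPTED and
// rejects on CONFIG_DENIED or CONFIG_TIMEOUT, assuming the socket exposes
// on("message", handler) as shown in the examples above.
function waitForConfigAccepted(socket) {
  return new Promise((resolve, reject) => {
    socket.on("message", (msg) => {
      if (msg.type === "CONFIG_ACCEPTED") {
        resolve(msg);
      } else if (msg.type === "CONFIG_DENIED" || msg.type === "CONFIG_TIMEOUT") {
        reject(new Error(`Configuration refused: ${msg.type}`));
      }
    });
  });
}
```

With this in place, `await waitForConfigAccepted(socket)` before the first sendAudio() call.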

Connecting without configuration

If you want to manage the whole handshake flow manually and only use the SDK for types and reconnection, omit configuration from connect(). The socket opens asynchronously, so wait for it to reach OPEN and then call sendConfiguration() yourself. You are then responsible for waiting for CONFIG_ACCEPTED before sending audio.
const socket = await client.stream.connect({ id: interactionId });

// Wait for CONFIG_ACCEPTED before sending audio
socket.on("message", (msg) => {
    if (msg.type === "CONFIG_ACCEPTED") {
        console.log("Configuration accepted — ready to send audio");
        socket.sendAudio(audioBuffer);
    }
    if (msg.type === "CONFIG_DENIED") {
        console.error("Configuration denied:", msg.reason);
        socket.close();
    }
});

// The socket opens asynchronously; wait for OPEN before sending configuration
await socket.waitForOpen();

socket.sendConfiguration({
    type: "config",
    configuration: {
        transcription: {
            primaryLanguage: "en",
            participants: [{ channel: 0, role: "doctor" }],
        },
        mode: { type: "facts", outputLocale: "en" },
    },
});

Sending audio

Both APIs accept raw audio chunks via sendAudio():
socket.sendAudio(audioBuffer); // Buffer, Uint8Array, ArrayBuffer, etc.
The SDK does not chunk audio for you. Send chunks at your own cadence — 100–250 ms per chunk is typical.
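If you have a whole recording in memory, a small helper can pre-slice it before sending. A sketch; the byte count that corresponds to 100–250 ms depends on your sample rate and format (for example, 16 kHz 16-bit mono PCM gives 16000 × 2 × 0.2 = 6400 bytes per 200 ms):

```javascript
// Sketch: split a full audio buffer into fixed-size chunks for sendAudio().
// chunkBytes should be derived from your audio format so each chunk covers
// roughly 100-250 ms, per the guidance above.
function chunkAudio(buffer, chunkBytes) {
  const chunks = [];
  for (let offset = 0; offset < buffer.length; offset += chunkBytes) {
    chunks.push(buffer.subarray(offset, offset + chunkBytes));
  }
  return chunks;
}
```

Each chunk can then be passed to sendAudio() on a timer, as the full example below does with setInterval.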

Full example

JavaScript
import fs from "fs";
import { CortiClient } from "@corti/sdk";

const client = new CortiClient({
  auth: {
    accessToken: "YOUR_ACCESS_TOKEN",
  },
});

// Interaction must be created via REST before opening a stream
const INTERACTION_ID = "YOUR_INTERACTION_UUID";

let socket;

try {
  // Step 1: Connect and send config — SDK waits for CONFIG_ACCEPTED before resolving
  socket = await client.stream.connect({
    id: INTERACTION_ID,
    configuration: {
      transcription: {
        primaryLanguage: "en",
        isDiarization: false,
        isMultichannel: false,
        participants: [{ channel: 0, role: "multiple" }],
      },
      mode: {
        type: "facts",      // or "transcription" if you don't need facts
        outputLocale: "en",
      },
    },
  });

  console.log("✅ Connected — session ready");

  socket.on("message", (msg) => {
    switch (msg.type) {
      case "transcript":
        msg.data.forEach((seg) => {
          console.log(`🗣  [${seg.time.start}s → ${seg.time.end}s] ${seg.transcript}`);
        });
        break;

      case "facts":
        msg.fact.forEach((fact) => {
          console.log(`💡 Fact [${fact.group}]: ${fact.text}`);
        });
        break;

      case "flushed":
        console.log("🔄 Buffer flushed");
        break;

      case "usage":
        console.log(`💳 Credits used: ${msg.credits}`);
        break;

      case "ENDED":
        console.log("🏁 Session ended — server closing socket");
        break;

      case "error":
        console.error("❌ Runtime error:", msg.error);
        break;
    }
  });

  socket.on("close", (code, reason) => {
    console.log(`🔌 Connection closed [${code}]: ${reason}`);
  });

  socket.on("error", (err) => console.error("🚨 Connection error:", err.message));

  // Step 2: Start sending audio now that config is accepted
  sendAudio();
} catch (err) {
  // CONFIG_DENIED, CONFIG_TIMEOUT, or connection failure
  console.error("❌ Failed to connect:", err);
  throw err;
}

// --- Audio sending ---

function sendAudio() {
  const AUDIO_FILE = "./sample.webm"; // swap with your audio file path

  if (!fs.existsSync(AUDIO_FILE)) {
    console.warn("⚠️  No audio file found — sending silence simulation");
    simulateAudioAndEnd();
    return;
  }

  const audioBuffer = fs.readFileSync(AUDIO_FILE);
  const CHUNK_SIZE = 8192; // tune so each chunk covers ~100–250 ms of audio
  let offset = 0;

  console.log(`🎙  Streaming ${audioBuffer.length} bytes of audio...`);

  const interval = setInterval(() => {
    if (offset >= audioBuffer.length) {
      clearInterval(interval);
      console.log("✅ All audio sent");
      endSession();
      return;
    }

    socket.sendAudio(audioBuffer.slice(offset, offset + CHUNK_SIZE));
    offset += CHUNK_SIZE;
  }, 300); // send a chunk every 300ms
}

function simulateAudioAndEnd() {
  setTimeout(() => endSession(), 2000);
}

// --- Optional: flush the audio buffer mid-session ---
function flushBuffer() {
  socket.sendFlush({ type: "flush" });
  console.log("📤 Sent flush");
}

// --- End the session ---
function endSession() {
  socket.sendEnd({ type: "end" });
  console.log("📤 Sent end — waiting for ENDED...");
}

Next steps (sending messages, lifecycle)

After the socket is OPEN and configuration has been accepted, you can use the SDK’s methods to send audio, flush, end, and subscribe to typed messages.

Resources


For support or questions, reach out through help.corti.app