The SDK wraps both WebSocket APIs — Stream and Transcribe — with a unified interface: a promise-based connect(), automatic reconnection, and typed message events.

Connecting

Both WebSocket APIs require a handshake before audio can flow. After the connection opens, the client must send a configuration message and wait for the server to respond with CONFIG_ACCEPTED. Only then is it safe to start streaming audio. If configuration is rejected (CONFIG_DENIED, CONFIG_TIMEOUT), the session cannot be used. By default, the SDK handles this handshake for you. Pass configuration to connect() and the promise resolves only after CONFIG_ACCEPTED is received — or rejects (and closes the socket) if the configuration is refused.
Because connect() awaits the handshake before returning, your socket.on("message", ...) handler is registered after the config exchange completes. CONFIG_* messages are never visible in the message handler — they are consumed internally by the SDK.
const socket = await client.stream.connect({
    id: interactionId,
    configuration: {
        transcription: {
            primaryLanguage: "en",
            participants: [{ channel: 0, role: "doctor" }],
        },
        mode: { type: "facts", outputLocale: "en" },
    },
});

socket.on("message", (msg) => {
    console.log(msg.type, msg.data);
});

socket.sendAudio(audioBuffer);
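Because connect() rejects when the server refuses the configuration, a try/catch around the call is the natural error boundary. A minimal sketch of the pattern; `connectFn` is a hypothetical stand-in for `client.stream.connect` so the logic can be shown in isolation:

```javascript
// Sketch: connect() rejects (and closes the socket) if the configuration is
// refused. `connectFn` is a hypothetical stand-in for client.stream.connect:
// any function that returns a promise resolving to a socket.
async function connectOrNull(connectFn, options) {
  try {
    return await connectFn(options);
  } catch (err) {
    // CONFIG_DENIED, CONFIG_TIMEOUT, or a transport failure
    console.error("Failed to establish session:", err.message ?? err);
    return null;
  }
}
```

If the helper returns null, start a fresh session rather than retrying on the same socket: a refused configuration leaves the session unusable.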

Getting the socket before the handshake completes

By default connect() waits for CONFIG_ACCEPTED before resolving. Set awaitConfiguration: false to have the socket returned immediately, before the WebSocket even opens. This lets you attach event handlers right away and guarantees you won’t miss any messages, including early config status events. You are then responsible for waiting for CONFIG_ACCEPTED before sending audio. If configuration is rejected, error events are emitted on the socket instead of the promise rejecting.
awaitConfiguration: false was the default behavior prior to v1.0.0. If you are migrating from an older version, set it explicitly to preserve the previous behavior.
const socket = await client.stream.connect({
    id: interactionId,
    configuration: {
        transcription: {
            primaryLanguage: "en",
            participants: [{ channel: 0, role: "doctor" }],
        },
        mode: { type: "facts", outputLocale: "en" },
    },
    awaitConfiguration: false, // socket returned immediately, before the WebSocket opens
});

// Must wait for CONFIG_ACCEPTED before sending audio
socket.on("message", (msg) => {
    if (msg.type === "CONFIG_ACCEPTED") {
        console.log("Configuration accepted — ready to send audio");
        socket.sendAudio(audioBuffer);
    }
    if (msg.type === "CONFIG_DENIED") {
        console.error("Configuration denied:", msg.reason);
        socket.close();
    }
});
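If you prefer await-style code over the handler above, you can promisify the wait yourself. A sketch under the assumption that the socket exposes on("message", ...) as in the examples; waitForConfigAccepted is a hypothetical helper, not an SDK method:

```javascript
// Hypothetical helper (not part of the SDK): resolves on CONFIG_ACCEPTED and
// rejects on CONFIG_DENIED or CONFIG_TIMEOUT, assuming the socket exposes
// on("message", handler) as shown in the examples above.
function waitForConfigAccepted(socket) {
  return new Promise((resolve, reject) => {
    socket.on("message", (msg) => {
      if (msg.type === "CONFIG_ACCEPTED") {
        resolve(msg);
      } else if (msg.type === "CONFIG_DENIED" || msg.type === "CONFIG_TIMEOUT") {
        reject(new Error(`Configuration refused: ${msg.type}`));
      }
    });
  });
}
```

With this in place, `await waitForConfigAccepted(socket)` before the first sendAudio() call.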

Connecting without configuration

If you want to manage the whole handshake flow manually and only use the SDK for types and reconnection, omit configuration from connect(). The socket opens asynchronously, so wait for it to reach OPEN and then call sendConfiguration() yourself. You are then responsible for waiting for CONFIG_ACCEPTED before sending audio.
const socket = await client.stream.connect({ id: interactionId });

// Wait for CONFIG_ACCEPTED before sending audio
socket.on("message", (msg) => {
    if (msg.type === "CONFIG_ACCEPTED") {
        console.log("Configuration accepted — ready to send audio");
        socket.sendAudio(audioBuffer);
    }
    if (msg.type === "CONFIG_DENIED") {
        console.error("Configuration denied:", msg.reason);
        socket.close();
    }
});

// The socket opens asynchronously; wait for OPEN before sending configuration
await socket.waitForOpen();

socket.sendConfiguration({
    type: "config",
    configuration: {
        transcription: {
            primaryLanguage: "en",
            participants: [{ channel: 0, role: "doctor" }],
        },
        mode: { type: "facts", outputLocale: "en" },
    },
});

Sending audio

Both APIs accept raw audio chunks via sendAudio():
socket.sendAudio(audioBuffer); // Buffer, Uint8Array, ArrayBuffer, etc.
The SDK does not chunk audio for you. Send chunks at your own cadence — 100–250 ms per chunk is typical.
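If you have a whole recording in memory, a small helper can pre-slice it before sending. A sketch; the byte count that corresponds to 100–250 ms depends on your sample rate and format (for example, 16 kHz 16-bit mono PCM gives 16000 × 2 × 0.2 = 6400 bytes per 200 ms):

```javascript
// Sketch: split a full audio buffer into fixed-size chunks for sendAudio().
// chunkBytes should be derived from your audio format so each chunk covers
// roughly 100-250 ms, per the guidance above.
function chunkAudio(buffer, chunkBytes) {
  const chunks = [];
  for (let offset = 0; offset < buffer.length; offset += chunkBytes) {
    chunks.push(buffer.subarray(offset, offset + chunkBytes));
  }
  return chunks;
}
```

Each chunk can then be passed to sendAudio() on a timer, as the full example below does with setInterval.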

Full example

JavaScript
import fs from "fs";
import { CortiClient } from "@corti/sdk";

const client = new CortiClient({
  auth: {
    accessToken: "YOUR_ACCESS_TOKEN",
  },
});

// Interaction must be created via REST before opening a stream
const INTERACTION_ID = "YOUR_INTERACTION_UUID";

let socket;

try {
  // Step 1: Connect and send config — SDK waits for CONFIG_ACCEPTED before resolving
  socket = await client.stream.connect({
    id: INTERACTION_ID,
    configuration: {
      transcription: {
        primaryLanguage: "en",
        isDiarization: false,
        isMultichannel: false,
        participants: [{ channel: 0, role: "multiple" }],
      },
      mode: {
        type: "facts",      // or "transcription" if you don't need facts
        outputLocale: "en",
      },
    },
  });

  console.log("✅ Connected — session ready");

  socket.on("message", (msg) => {
    switch (msg.type) {
      case "transcript":
        msg.data.forEach((seg) => {
          console.log(`🗣  [${seg.time.start}s → ${seg.time.end}s] ${seg.transcript}`);
        });
        break;

      case "facts":
        msg.fact.forEach((fact) => {
          console.log(`💡 Fact [${fact.group}]: ${fact.text}`);
        });
        break;

      case "flushed":
        console.log("🔄 Buffer flushed");
        break;

      case "usage":
        console.log(`💳 Credits used: ${msg.credits}`);
        break;

      case "ENDED":
        console.log("🏁 Session ended — server closing socket");
        break;

      case "error":
        console.error("❌ Runtime error:", msg.error);
        break;
    }
  });

  socket.on("close", (code, reason) => {
    console.log(`🔌 Connection closed [${code}]: ${reason}`);
  });

  socket.on("error", (err) => console.error("🚨 Connection error:", err.message));

  // Step 2: Start sending audio now that config is accepted
  sendAudio();
} catch (err) {
  // CONFIG_DENIED, CONFIG_TIMEOUT, or connection failure
  console.error("❌ Failed to connect:", err);
  throw err;
}

// --- Audio sending ---

function sendAudio() {
  const AUDIO_FILE = "./sample.webm"; // swap with your audio file path

  if (!fs.existsSync(AUDIO_FILE)) {
    console.warn("⚠️  No audio file found — sending silence simulation");
    simulateAudioAndEnd();
    return;
  }

  const audioBuffer = fs.readFileSync(AUDIO_FILE);
  const CHUNK_SIZE = 8192; // tune so each chunk covers ~100–250 ms of audio
  let offset = 0;

  console.log(`🎙  Streaming ${audioBuffer.length} bytes of audio...`);

  const interval = setInterval(() => {
    if (offset >= audioBuffer.length) {
      clearInterval(interval);
      console.log("✅ All audio sent");
      endSession();
      return;
    }

    socket.sendAudio(audioBuffer.slice(offset, offset + CHUNK_SIZE));
    offset += CHUNK_SIZE;
  }, 300); // send a chunk every 300ms
}

function simulateAudioAndEnd() {
  setTimeout(() => endSession(), 2000);
}

// --- Optional: flush the audio buffer mid-session ---
function flushBuffer() {
  socket.sendFlush({ type: "flush" });
  console.log("📤 Sent flush");
}

// --- End the session ---
function endSession() {
  socket.sendEnd({ type: "end" });
  console.log("📤 Sent end — waiting for ENDED...");
}

Next steps (sending messages, lifecycle)

After the socket is OPEN and configuration has been accepted, you can use the SDK’s methods to send audio, flush, end, and subscribe to typed messages.

Resources


For support or questions, reach out through help.corti.app