When diarization is enabled (isDiarization: true) on /streams, the server attributes speech to distinct speakers. Because segments for different speakers are finalized independently, type: "transcript" messages are not guaranteed to arrive in chronological order over the WebSocket.
This applies to multi-speaker /streams sessions only when diarization is enabled. The /transcribe endpoint is single-speaker only.

Diarized Transcript Segment Ordering

Each transcript message carries a data array, and a single message can contain segments for several speakers. When ordering transcript text, account for both speaker and speech time: a segment that one speaker said later can arrive before a segment that another speaker said earlier.
Only final segments are returned on /streams (final: true); there are no interim results to reconcile. The handling concern is purely ordering, not the interim-vs-final deduplication required when handling dictation transcripts.
Each segment in data includes a time object you can use to recover the true order:
Field                 Use
time.start            Primary ordering key: when the speech began (seconds)
time.end              Secondary key for tie-breaks and overlap checks (seconds)
speakerId             A distinct integer per detected speaker (up to four). Returns -1 when diarization is disabled.
participant.channel   Audio channel the segment was attributed to
speakerId and participant.channel are independent concepts. Diarization separates speakers within a transcribed stream; the channel reflects audio routing. Do not assume a fixed mapping between the two.
Rendering segments in socket-arrival order may interleave one speaker’s sentence into the middle of another’s. No words are lost in that case, but the rendered sequence no longer matches the order in which things were said. Always order segments by time.start.
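As a minimal illustration of the problem, the sketch below sorts two made-up segments that arrived out of order. The text field name and all sample values are assumptions for illustration only; time.start, time.end, speakerId, and participant.channel are the fields listed above.

```javascript
// Hypothetical segments, in the order they arrived over the socket.
// `text` is an assumed field name; all values are invented for illustration.
const arrived = [
  {
    speakerId: 2,
    participant: { channel: 0 },
    time: { start: 4.2, end: 6.0 },
    text: "Sure, go ahead.",
  },
  {
    speakerId: 1,
    participant: { channel: 0 },
    time: { start: 1.0, end: 3.9 },
    text: "Can I ask a question?",
  },
];

// Compare by start time, then by end time to break ties.
const byTime = (a, b) =>
  a.time.start - b.time.start || a.time.end - b.time.end;

// Copy before sorting so the arrival order stays available for debugging.
const spoken = [...arrived].sort(byTime);

console.log(arrived.map((s) => s.speakerId)); // arrival order: speaker 2, then 1
console.log(spoken.map((s) => s.speakerId)); // spoken order: speaker 1, then 2
```

Sorting restores the question-then-answer order even though the answer reached the client first.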

Recommendation

  • Position each segment by time.start, not by arrival order. Insert into the running transcript at the position determined by time.start rather than appending to the end.
  • Use time.end as a secondary key for tie-breaks.
  • Iterate the full data array on every message and order before rendering — do not assume a single segment per message.
case "transcript":
  // message.data is an array — order by start time before committing to the UI
  const ordered = [...message.data].sort(
    (a, b) => a.time.start - b.time.start || a.time.end - b.time.end
  );
  ordered.forEach((seg) => {
    insertByStartTime(seg); // position by seg.time.start, do not blindly append
  });
  break;
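The snippet above calls insertByStartTime without defining it. One possible shape for that helper is sketched below. It is not part of the API: it assumes the running transcript is held in an in-memory array (the name transcript is hypothetical) and uses a binary search to find the insert position, so that segments arriving in later messages also land in the right place.

```javascript
// Running transcript, kept sorted by (time.start, time.end).
// Hypothetical client-side state; not provided by the API.
const transcript = [];

// Negative if a sorts before b, positive if after, 0 if both keys tie.
function compareSegments(a, b) {
  return a.time.start - b.time.start || a.time.end - b.time.end;
}

// Binary search for the first index whose segment sorts after `seg`,
// then splice the new segment in there. Segments that tie on both keys
// keep their arrival order.
function insertByStartTime(seg) {
  let lo = 0;
  let hi = transcript.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (compareSegments(transcript[mid], seg) <= 0) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  transcript.splice(lo, 0, seg);
  return lo; // index where the segment now sits, e.g. for a DOM insert
}
```

The returned index can drive the UI update, for example by inserting the new row before the element currently rendered at that position. For long sessions, a structure with cheaper mid-array inserts may be worth considering, since splice shifts every later element.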
Please contact us if you need further assistance working with diarized transcripts.