Diarized Transcript Handling

When diarization is enabled (diarize: true) on /streams, the server attributes speech to distinct speakers. Because segments for different speakers are finalized independently, type: "transcript" messages are not guaranteed to arrive in chronological order over the WebSocket.

This applies only to /streams sessions with diarization enabled. The /transcribe endpoint is single-speaker only, so it is not affected.

Diarized Transcript Segment Ordering

Each transcript message carries a data array, and a single message can contain multiple segments from different speakers. Order transcript text by speech time, not by speaker or arrival: a later-spoken segment from one speaker can arrive before an earlier-spoken segment from another.

Only final segments are returned on /streams (final: true), so there are no interim results to reconcile. The only handling concern is ordering, not the interim-versus-final deduplication that dictation transcript handling requires.

Each segment in data includes a time object you can use to recover the true order:

Field	Use
`time.start`	Primary ordering key — when the speech began (seconds)
`time.end`	Secondary key for tie-breaks and overlap checks (seconds)
`speakerId`	A distinct integer per detected speaker (up to four). Returns `-1` when diarization is disabled.
`participant.channel`	Audio channel the segment was attributed to

speakerId and participant.channel are independent concepts. Diarization separates speakers within a transcribed stream; the channel reflects audio routing. Do not assume a fixed mapping between the two.
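As a minimal sketch of the distinction above, the following JavaScript labels and groups segments by speakerId alone and never reads participant.channel. The helper names speakerLabel and groupBySpeaker, the label strings, and the sample values are illustrative assumptions, not part of the API. The source does not state whether speakerId numbering starts at 0 or 1, so the label shows the raw value.

```javascript
// Hypothetical helper: build a display label from speakerId only.
function speakerLabel(seg) {
  // speakerId is -1 when diarization is disabled for the stream.
  if (seg.speakerId === -1) {
    return "Unknown speaker";
  }
  return `Speaker ${seg.speakerId}`;
}

// Hypothetical helper: group segments per speaker.
// participant.channel is deliberately ignored, because there is no
// fixed mapping between a channel and a speaker.
function groupBySpeaker(segments) {
  const groups = new Map();
  for (const seg of segments) {
    if (!groups.has(seg.speakerId)) {
      groups.set(seg.speakerId, []);
    }
    groups.get(seg.speakerId).push(seg);
  }
  return groups;
}

// Illustrative segments: two speakers share one channel, and one
// speaker appears on two channels.
const sampleSegments = [
  { speakerId: 0, participant: { channel: 1 }, time: { start: 0.0, end: 1.5 } },
  { speakerId: 1, participant: { channel: 1 }, time: { start: 1.6, end: 3.0 } },
  { speakerId: 0, participant: { channel: 2 }, time: { start: 3.1, end: 4.0 } },
];
const sampleGroups = groupBySpeaker(sampleSegments);
```

Keying the map on speakerId keeps a speaker's segments together even when they surface on more than one channel.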

Rendering segments in socket-arrival order may interleave one speaker’s sentence into the middle of another’s. No words are lost in that case, but the displayed sequence no longer matches the order in which the words were spoken. Always order segments by time.start.

Recommendation

Position each segment by time.start, not by arrival order: insert it into the running transcript at that position rather than appending to the end.
Use time.end as a secondary key for tie-breaks.
Iterate the full data array on every message and order before rendering — do not assume a single segment per message.

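case "transcript": {
  // Braces scope `ordered` to this case so it cannot clash with other cases.
  // message.data is an array: order by start time (end time breaks ties)
  // before committing to the UI.
  const ordered = [...message.data].sort(
    (a, b) => a.time.start - b.time.start || a.time.end - b.time.end
  );
  // insertByStartTime is your application's own helper: it positions the
  // segment by seg.time.start and does not blindly append.
  ordered.forEach((seg) => {
    insertByStartTime(seg);
  });
  break;
}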
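The snippet above calls insertByStartTime without defining it. One possible implementation, offered as a sketch and not as the reference method, keeps the running transcript in a sorted array and uses a binary search to find the insertion point. The transcript array, the compareSegments helper, and the optional list parameter are assumptions introduced for illustration.

```javascript
// Running transcript, kept sorted by time.start, then time.end.
// (Hypothetical in-memory store; a real UI may hold this elsewhere.)
const transcript = [];

// Ordering rule from the recommendation: start time first,
// end time as the tie-break.
function compareSegments(a, b) {
  return a.time.start - b.time.start || a.time.end - b.time.end;
}

// Insert one segment at its sorted position and return that index.
// The binary search finds the first slot whose segment sorts strictly
// after `seg`, so segments with identical keys keep arrival order.
function insertByStartTime(seg, list = transcript) {
  let lo = 0;
  let hi = list.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (compareSegments(list[mid], seg) <= 0) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  list.splice(lo, 0, seg);
  return lo;
}

// Usage: a later-spoken segment arrives first, then an earlier one.
insertByStartTime({ speakerId: 1, time: { start: 4.2, end: 6.0 } });
insertByStartTime({ speakerId: 0, time: { start: 1.0, end: 3.5 } });
// transcript now holds the speakerId 0 segment before the speakerId 1 segment.
```

Returning the insertion index lets the rendering layer update only the affected position, with no need to redraw the whole transcript.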

Please contact us if you need further assistance working with diarized transcripts.
