# Streams endpoint

> Real-time conversational clinical intelligence

Interested in building a real-time intelligence or clinical decision support solution? Look no further than the Corti AI Streams API, which provides real-time, bidirectional WebSocket Secure (WSS) communication with the Corti AI platform for live transcript generation and clinical fact extraction ([FactsR™](/textgen/factsr)).

<Tip>**Connecting from a browser?** Never expose `client_id`/`client_secret` or a full-scope `access_token` to the frontend. Issue a **limited-scope token** with `scope="openid streams"` from your backend so the token can only be used against this WebSocket endpoint. See [Limited-scope credentials for streaming APIs](/authentication/security_best_practices#4-if-you-must-use-tokens-in-special-cases-use-limited-scope-credentials).</Tip>
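
As a rough illustration of the tip above, a backend might build a token request restricted to the streams scope along these lines. This is a minimal sketch, not the documented flow: the `client_credentials` grant type and the form-encoded body are assumptions, the function name is hypothetical, and the token endpoint URL is deliberately left out because it is not given on this page. Only the `scope` value comes from the text above.

```python
from urllib.parse import urlencode


def build_limited_scope_token_request(client_id: str, client_secret: str) -> str:
    """Return a form-encoded request body asking for a streams-only token.

    ASSUMPTION: an OAuth2 client-credentials grant with a form-encoded body.
    Check the authentication pages for the actual endpoint and parameters.
    """
    return urlencode(
        {
            "grant_type": "client_credentials",  # assumed grant type
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": "openid streams",  # limited scope named in the tip above
        }
    )
```

The backend would send this body to the token endpoint and pass only the resulting limited-scope token to the browser, so the client secret never leaves the server.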

<Card>
  Delivered as a modular API, `/streams` provides a stateful interaction framework for developers to embed clinical-grade intelligence directly into their healthcare applications, creating safer, leaner, and more trusted AI experiences at the point of care.

  <Tip>
    Depending on your use case, the streams endpoint can be used for real-time conversational transcription, fact extraction, or both! See the full API specification [here](/api-reference/streams).
  </Tip>
</Card>

***

## Using the API

<Steps>
  <Step>
    Start a `/streams` session by creating an Interaction, which returns a WebSocket URL along with the `interactionId`.
  </Step>

  <Step>
    Connect to the WebSocket and set your configuration:

    | Parameter           | Description                                                                                                                                   |
    | :------------------ | :-------------------------------------------------------------------------------------------------------------------------------------------- |
    | `primaryLanguage`   | Spoken language to be transcribed                                                                                                             |
    | `diarize`           | Enable speaker separation (most useful on single channel audio)<br /><Info>Note the legacy parameter `isDiarization` is still accepted</Info> |
    | `isMultichannel`    | Enable multichannel audio.                                                                                                                    |
    | `participants`      | Assign speaker roles for audio channels. Must be used in conjunction with `isMultichannel: true`.                                             |
    | `mode.type`         | Define `facts` or `transcription` depending on the desired real-time output                                                                   |
    | `mode.outputLocale` | Output language for extracted `facts` (required with `"type":"facts"`)                                                                        |

    <Note>See detailed configuration options [here](/api-reference/streams#configuration)</Note>
  </Step>

  <Step>
    Once config is accepted, begin sending audio packets.
  </Step>

  <Step>
    Receive transcripts every \~3 seconds and facts every \~60 seconds (these standard intervals can be adapted if you need custom response times).
  </Step>

  <Step>
    Send the `end` message to close the audio stream.
  </Step>
</Steps>
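
The client-side messages for the steps above could be prepared roughly as follows. This is a minimal sketch under stated assumptions: the parameter names come from the table above, but the message envelope (`type`, `configuration`, `transcription`), the shape of a `participants` entry, and the JSON form of the `end` message are assumptions, and all function names are hypothetical. No network connection is opened here; refer to the API specification for the actual message schema.

```python
import json
from typing import Iterator, List, Optional


def build_config_message(
    primary_language: str,
    mode_type: str = "transcription",
    output_locale: Optional[str] = None,
    diarize: bool = False,
    is_multichannel: bool = False,
    participants: Optional[List[dict]] = None,
) -> str:
    """Serialize a configuration message from the parameters in the table above."""
    if mode_type not in ("facts", "transcription"):
        raise ValueError("mode.type must be 'facts' or 'transcription'")
    if mode_type == "facts" and not output_locale:
        raise ValueError("mode.outputLocale is required when mode.type is 'facts'")
    if participants and not is_multichannel:
        raise ValueError("participants requires isMultichannel to be true")

    transcription = {
        "primaryLanguage": primary_language,
        "diarize": diarize,
        "isMultichannel": is_multichannel,
    }
    if participants:
        # ASSUMPTION: each entry looks like {"channel": 0, "role": "doctor"}.
        transcription["participants"] = participants

    mode = {"type": mode_type}
    if output_locale:
        mode["outputLocale"] = output_locale

    # ASSUMPTION: the envelope keys below are illustrative, not confirmed here.
    return json.dumps(
        {
            "type": "config",
            "configuration": {"transcription": transcription, "mode": mode},
        }
    )


def chunk_audio(audio: bytes, chunk_size: int) -> Iterator[bytes]:
    """Split raw audio bytes into packets of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(audio), chunk_size):
        yield audio[start : start + chunk_size]


def build_end_message() -> str:
    """Serialize the message that closes the audio stream (assumed JSON shape)."""
    return json.dumps({"type": "end"})
```

With a WebSocket client of your choice, the order would be: send the output of `build_config_message(...)`, wait for the configuration to be accepted, send each packet from `chunk_audio(...)` as a binary frame, then send `build_end_message()`. Validating the configuration locally catches the two constraints stated in the table before a connection is opened.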

***

## Features

<Tip> Click on the cards to learn more...</Tip>

### Languages

<Card href="/stt/languages/">
  Corti speech to text is specifically designed for use in the healthcare domain. A tier system categorizes the functionality and performance available per language and endpoint. Languages in the Enhanced and Premier tiers offer the most functionality and the highest recognition accuracy, and are the ones recommended for dictation use.
</Card>

### Audio Configuration

<Card href="/stt/audio/">
  With support for mono or multi-channel audio, live transcoding, and a variety of file formats to choose from, don't let the complexities of audio capture and processing inhibit opportunities for real-time intelligence. Read more about our recommendations and best practices.
</Card>

### FactsR

<Card href="/textgen/factsr/">
  FactsR™ is a real-time agentic reasoning system for clinical consultations. Designed with ambient documentation in mind, FactsR reduces the “note bloat” of general-purpose AI by 65 percent, keeping records precise, relevant, and tightly aligned with the actual clinical conversation.
</Card>

### Formatting

<Card href="/stt/formatting/">
  Speech to text can be used to create a verbatim transcript of the audio; however, some content is not documented in the same manner as it is verbalized. The `formatting` feature ensures that key information (dates, numbers, measurements, etc.) is output as expected in the transcript. <br /><sub>*Server defaults are applied; unlike `/transcribe`, this endpoint does not currently expose configuration of formatting preferences.*</sub>
</Card>

### Diarization

<Card href="/stt/diarization/">
  Diarization is the process of segmenting an audio recording by speaker, assigning portions of speech to distinct identities (e.g., “Doctor,” “Patient”). This enables accurate transcription, attribution, and analysis of multi-speaker clinical conversations, but is not required for effective AI scribing or workflow speech-enablement.
</Card>

### Audio Events

<Card href="/stt/audio-events/">
  Real-time events about audio quality and speech activity during streaming, intended to notify the integrator of audio health degradation, periods of silence, or other events that could support application behavior or user warnings.
</Card>

### Replacements

<Card href="/stt/replacements/">
  `coming soon`<br />Ability to define words or phrases that should be returned in place of the standard output by the speech-to-text model.
</Card>

### Keyterms

<Card href="/stt/keyterms/">
  Bias speech-to-text output so that new words can be introduced to the system vocabulary (e.g., surnames) or to improve recognition reliability for homophones and words with ambiguous pronunciation.
</Card>

<br />

<Note>
  Please [contact us](mailto:help@corti.ai) for more information or help.
</Note>