# Recordings and Transcripts

> Speech to text via batch audio file processing

Speech to text is a machine learning technology that transforms individual or conversational spoken audio into text, and it has been used in medical transcription workflows for decades. Although real-time audio streaming opens up many opportunities, audio file-based transcription is still necessary in many clinical workflows.

<Card>
  This page explains the key functionality of Corti audio file processing via the `/recordings` and `/transcripts` endpoints.

  API specifications: [Upload recording](/api-reference/recordings/upload-recording) and [Create transcript](/api-reference/transcripts/create-transcript).

  <Tip>
    Transcript creation is a synchronous-to-asynchronous workflow. The steps below show how to upload audio files, then create and receive transcripts.

    Read more about real-time audio streaming: [/transcribe](/stt/transcribe) and [/streams](/stt/streams).
  </Tip>
</Card>

***

## Using the API

<Steps>
  <Step>
    Review supported audio file requirements [here](/stt/audio).

    <Info>Corti speech to text supports file transcoding; however, it is recommended to follow the outlined best practices for a consistent and reliable experience.</Info>
  </Step>

  <Step>
    Create an Interaction: `POST:/interactions/`

    <Info>Note the `interactionId` in the response; it is used to group the audio file and transcript assets.</Info>

    <CodeGroup>
      {/* Create an interaction with full encounter and patient details */}

      ```ts title="JavaScript" theme={null}
      // Replace these with your values
      const ASSIGNED_USER_ID = "<uuid-of-your-choosing>";
      const IDENTIFIER = "<id-of-your-choosing>";

      const now = new Date();

      const { interactionId, websocketUrl } = await client.interactions.create({
        assignedUserId: ASSIGNED_USER_ID,
        encounter: {
          identifier: IDENTIFIER,
          status: "planned",
          type: "first_consultation",
          period: {
            startedAt: now,
            endedAt: now,
          },
          title: "Consultation",
        },
        patient: {
          identifier: "<string>",
          name: "<string>",
          gender: "male",
          birthDate: new Date("1990-01-15T00:00:00Z"),
          pronouns: "<string>",
        },
      });
      ```

      {/* Create an interaction with full encounter and patient details */}

      ```csharp title="C# .NET" theme={null}
      // Replace these with your values
      const string ASSIGNED_USER_ID = "<uuid-of-your-choosing>";
      const string IDENTIFIER = "<id-of-your-choosing>";

      var now = DateTime.UtcNow;

      var interaction = await client.Interactions.CreateAsync(new InteractionsCreateRequest
      {
          AssignedUserId = ASSIGNED_USER_ID,
          Encounter = new InteractionsEncounterCreateRequest
          {
              Identifier = IDENTIFIER,
              Status = InteractionsEncounterStatusEnum.Planned,
              Type = InteractionsEncounterTypeEnum.FirstConsultation,
              Period = new InteractionsEncounterPeriod
              {
                  StartedAt = now,
                  EndedAt = now,
              },
              Title = "Consultation",
          },
          Patient = new InteractionsPatient
          {
              Identifier = "<string>",
              Name = "<string>",
              Gender = InteractionsGenderEnum.Male,
              BirthDate = new DateTime(1990, 1, 15, 0, 0, 0, DateTimeKind.Utc),
              Pronouns = "<string>",
          },
      });
      ```

      ```python title="Python" theme={null}
      import requests
      from datetime import datetime, timezone

      # Replace these with your values
      ASSIGNED_USER_ID = "<uuid-of-your-choosing>"
      ENVIRONMENT = "<eu-or-us>"
      IDENTIFIER = "<id-of-your-choosing>"
      TENANT = "<your-tenant-name>"
      TOKEN = "<your-access-token>"

      now = datetime.now(timezone.utc).isoformat()

      response = requests.post(
          f"https://api.{ENVIRONMENT}.corti.app/v2/interactions",
          headers={
              "Authorization": f"Bearer {TOKEN}",
              "Tenant-Name": TENANT,
              "Content-Type": "application/json",
          },
          json={
              "assignedUserId": ASSIGNED_USER_ID,
              "encounter": {
                  "identifier": IDENTIFIER,
                  "status": "planned",
                  "type": "first_consultation",
                  "period": {"startedAt": now, "endedAt": now},
                  "title": "Consultation",
              },
              "patient": {
                  "identifier": "<string>",
                  "name": "<string>",
                  "gender": "male",
                  "birthDate": "1990-01-15T00:00:00Z",
                  "pronouns": "<string>",
              },
          },
      )
      response.raise_for_status()
      interaction = response.json()
      interaction_id = interaction["interactionId"]
      ```

      ```bash title="cURL" theme={null}
      # Replace these with your values
      ASSIGNED_USER_ID="<uuid-of-your-choosing>"
      ENVIRONMENT="<eu-or-us>"
      IDENTIFIER="<id-of-your-choosing>"
      TENANT="<your-tenant-name>"
      TOKEN="<your-access-token>"

      curl -X POST "https://api.${ENVIRONMENT}.corti.app/v2/interactions" \
        -H "Authorization: Bearer ${TOKEN}" \
        -H "Tenant-Name: ${TENANT}" \
        -H "Content-Type: application/json" \
        -d '{
          "assignedUserId": "'"${ASSIGNED_USER_ID}"'",
          "encounter": {
            "identifier": "'"${IDENTIFIER}"'",
            "status": "planned",
            "type": "first_consultation",
            "period": { "startedAt": "2024-01-01T00:00:00Z", "endedAt": "2024-01-01T00:00:00Z" },
            "title": "Consultation"
          },
          "patient": {
            "identifier": "<string>",
            "name": "<string>",
            "gender": "male",
            "birthDate": "1990-01-15T00:00:00Z",
            "pronouns": "<string>"
          }
        }'
      ```
    </CodeGroup>
  </Step>

  <Step>
    Upload an audio file: `POST:/interactions/{id}/recordings/`

    <Info>Note the `recordingId` in the response; it is used for transcript creation.</Info>

    <CodeGroup>
      ```ts title="JavaScript" theme={null}
      import { createReadStream } from "fs";

      // Replace these with your values
      const INTERACTION_ID = "<your-interaction-id>";

      const file = createReadStream("sample.mp3", { autoClose: true });
      // Keep the response: it carries the recordingId needed to create the transcript
      const recording = await client.recordings.upload(file, INTERACTION_ID);
      ```

      ```csharp title="C# .NET" theme={null}
      // Replace these with your values
      const string INTERACTION_ID = "<your-interaction-id>";

      var file = File.OpenRead("sample.mp3");
      var recording = await client.Recordings.UploadAsync(INTERACTION_ID, file);
      ```

      ```python title="Python" theme={null}
      import requests

      # Replace these with your values
      ENVIRONMENT = "<eu-or-us>"
      INTERACTION_ID = "<your-interaction-id>"
      TENANT = "<your-tenant-name>"
      TOKEN = "<your-access-token>"

      with open("sample.mp3", "rb") as f:
          response = requests.post(
              f"https://api.{ENVIRONMENT}.corti.app/v2/interactions/{INTERACTION_ID}/recordings/",
              headers={
                  "Authorization": f"Bearer {TOKEN}",
                  "Tenant-Name": TENANT,
                  "Content-Type": "application/octet-stream",
              },
              data=f,
          )
      response.raise_for_status()
      recording_id = response.json()["recordingId"]
      ```

      ```bash title="cURL" theme={null}
      # Replace these with your values
      ENVIRONMENT="<eu-or-us>"
      INTERACTION_ID="<your-interaction-id>"
      TENANT="<your-tenant-name>"
      TOKEN="<your-access-token>"

      curl -X POST "https://api.${ENVIRONMENT}.corti.app/v2/interactions/${INTERACTION_ID}/recordings/" \
        -H "Authorization: Bearer ${TOKEN}" \
        -H "Tenant-Name: ${TENANT}" \
        -H "Content-Type: application/octet-stream" \
        --data-binary "@sample.mp3"
      ```
    </CodeGroup>
  </Step>

  <Step>
    Create the transcript: `POST:/interactions/{id}/transcripts/`

    <Info>Each interaction may have more than one audio file and transcript associated with it. Audio files of up to 60 minutes in total duration, or 150 MB in total size, are supported.</Info>

    <CodeGroup>
      ```ts title="JavaScript" theme={null}
      const { recordingId } = await uploadRecording(client, interactionId, "sample.mp3");

      const transcript = await client.transcripts.create(interactionId, {
          recordingId: recordingId,
          primaryLanguage: "en"
      });
      ```

      ```csharp title="C# .NET" theme={null}
      var recording = await UploadRecordingAsync(client, interactionId, "sample.mp3");

      var transcript = await client.Transcripts.CreateAsync(
          interactionId,
          new TranscriptsCreateRequest
          {
              RecordingId = recording.RecordingId,
              PrimaryLanguage = "en",
          }
      );
      ```

      ```python title="Python" theme={null}
      import requests

      # Replace these with your values
      ENVIRONMENT = "<eu-or-us>"
      INTERACTION_ID = "<your-interaction-id>"
      RECORDING_ID = "<your-recording-id>"
      TENANT = "<your-tenant-name>"
      TOKEN = "<your-access-token>"

      response = requests.post(
          f"https://api.{ENVIRONMENT}.corti.app/v2/interactions/{INTERACTION_ID}/transcripts/",
          headers={
              "Authorization": f"Bearer {TOKEN}",
              "Tenant-Name": TENANT,
              "Content-Type": "application/json",
          },
          json={
              "recordingId": RECORDING_ID,
              "primaryLanguage": "en",
          },
      )
      response.raise_for_status()
      transcript = response.json()
      ```

      ```bash title="cURL" theme={null}
      # Replace these with your values
      ENVIRONMENT="<eu-or-us>"
      INTERACTION_ID="<your-interaction-id>"
      RECORDING_ID="<your-recording-id>"
      TENANT="<your-tenant-name>"
      TOKEN="<your-access-token>"

      curl -X POST "https://api.${ENVIRONMENT}.corti.app/v2/interactions/${INTERACTION_ID}/transcripts/" \
        -H "Authorization: Bearer ${TOKEN}" \
        -H "Tenant-Name: ${TENANT}" \
        -H "Content-Type: application/json" \
        -d '{
          "recordingId": "'"${RECORDING_ID}"'",
          "primaryLanguage": "en"
        }'
      ```
    </CodeGroup>
  </Step>

  <Step>
    Receive the transcript:

    * First, the transcript is processed synchronously for a maximum of 25 seconds.
    * If transcription takes longer than the 25-second synchronous processing timeout, processing continues asynchronously.
      * In this scenario, an empty transcript is returned with a `Location` header that can be used to retrieve the final transcript via the `transcriptId`.
      * The client can poll the transcript status endpoint (`GET /interactions/{id}/transcripts/{transcriptId}/status`) for the transcript status (`processing`, `completed`, or `failed`).

    <Check>Use the [List Transcripts](/api-reference/transcripts/list-transcripts) endpoint to view all transcripts associated with an interaction. Completed transcripts can be retrieved via the [Get Transcript](/api-reference/transcripts/get-transcript) endpoint.</Check>
  </Step>
</Steps>
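
The final "receive" step can be sketched as a small polling loop. This is a minimal illustration, not an official client: it uses only the Python standard library, and it assumes that the status endpoint returns JSON with a `status` field and that the `Location` header ends with the `transcriptId`. The helper names (`transcript_id_from_location`, `status_url`, `fetch_status`, `wait_for_transcript`) and the polling interval and timeout values are hypothetical.

```python
import json
import time
import urllib.request
from typing import Callable
from urllib.parse import urlparse

# Statuses after which polling should stop.
TERMINAL_STATUSES = {"completed", "failed"}


def transcript_id_from_location(location: str) -> str:
    """Return the last path segment of the Location header.

    Assumption: the header ends with .../transcripts/{transcriptId}.
    """
    return urlparse(location).path.rstrip("/").split("/")[-1]


def status_url(environment: str, interaction_id: str, transcript_id: str) -> str:
    """Build the status URL from the base URL pattern used in the steps above."""
    return (
        f"https://api.{environment}.corti.app/v2/interactions/"
        f"{interaction_id}/transcripts/{transcript_id}/status"
    )


def fetch_status(url: str, token: str, tenant: str) -> str:
    """GET the status endpoint once and return the status string."""
    request = urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}", "Tenant-Name": tenant},
    )
    with urllib.request.urlopen(request) as response:
        # Assumption: the response body is JSON with a "status" field.
        return json.load(response)["status"]


def wait_for_transcript(
    get_status: Callable[[], str],
    interval_seconds: float = 2.0,
    timeout_seconds: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call get_status until it returns a terminal status or the timeout elapses."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Transcript still '{status}' after {timeout_seconds} s")
        sleep(interval_seconds)
```

A caller would typically build the URL once with `status_url(...)` and then run `wait_for_transcript(lambda: fetch_status(url, TOKEN, TENANT))`. Passing the status lookup in as a callable keeps the loop independent of the HTTP library, so it can be exercised with a stub before pointing it at the API.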

***

## Features

<Tip>Click on the cards to learn more.</Tip>

### Languages

<Card href="/stt/languages/">
  Corti speech to text is specifically designed for use in the healthcare domain. A tier system categorizes the functionality and performance available per language and endpoint. Languages in the Enhanced and Premier tiers offer the most functionality and the highest recognition accuracy, and are the ones recommended for dictation use.
</Card>

### Audio Configuration

<Card href="/stt/audio/">
  Corti supports mono or multi-channel audio, live transcoding, and a variety of file formats, so the complexities of audio capture and processing need not inhibit opportunities for real-time intelligence. Read more about our recommendations and best practices.
</Card>

### Punctuation

<Card href="/stt/punctuation/">
  Punctuation is essential for coherent documentation. Setting the `isDictation` parameter to `true` in `/transcripts` requests enables `spokenPunctuation` functionality.
</Card>
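
As a rough illustration of the punctuation setting, the sketch below builds a `/transcripts` request body with `isDictation` enabled. The `recordingId` and `primaryLanguage` fields come from the create-transcript step above; placing `isDictation` at the top level of the body is an assumption, and `build_transcript_request` is a hypothetical helper, so check the API reference before relying on this shape.

```python
import json


def build_transcript_request(
    recording_id: str,
    primary_language: str = "en",
    is_dictation: bool = False,
) -> dict:
    """Build the JSON body for POST /interactions/{id}/transcripts/."""
    return {
        "recordingId": recording_id,
        "primaryLanguage": primary_language,
        # Assumption: isDictation is a top-level boolean in the request body.
        "isDictation": is_dictation,
    }


body = build_transcript_request("<your-recording-id>", is_dictation=True)
print(json.dumps(body, indent=2))
```

The resulting dictionary can be passed as the `json=` argument of the Python request shown in the create-transcript step.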

### Diarization

<Card href="/stt/diarization/">
  Diarization is the process of segmenting an audio recording by speaker, assigning portions of speech to distinct identities (e.g., “Doctor,” “Patient”). This enables accurate transcription, attribution, and analysis of multi-speaker clinical conversations, but is not required for effective AI scribing or workflow speech-enablement.
</Card>

### Formatting

<Card href="/stt/formatting/">
  `coming soon`<br />Speech to text can be used to create a verbatim transcript of the audio; however, some content is not documented in the same manner as it is verbalized. The `formatting` features provide control over how key information should be represented in the textual output.<br /><sub>*This feature is currently supported on `/transcribe` and `/streams`, and is coming soon to `/transcripts`.*</sub>
</Card>

<br />

<Note>
  Please [contact us](https://help.corti.app) for more information or help.
</Note>
