Introduction
The Transcription Workflow processes a complete audio file and returns a text document. When real-time speech-to-text is not required or not feasible, this workflow provides a functional, cost-effective way to create verbatim, conversational, or dictation-style transcripts.
Endpoints and capabilities
| Endpoint | Capability | Use |
|---|---|---|
| Interactions | The foundational unit that ties together all related data and operations, enabling a cohesive workflow. | Required |
| Recordings | Upload audio file(s) that can be used for transcript generation. | Required |
| Transcripts | Generate transcripts for audio files that are associated with the interaction. | Required |
Workflow
1
Create interaction
- The workflow begins with the client initiating an interaction by sending a `POST` request to the `/interactions` endpoint.
- The API responds with a unique `id` for the interaction and a WebSocket URL (`wssUrl`). The `id` is used to manage the subsequent steps of the workflow. The WebSocket URL is not required for this workflow.
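As a minimal sketch of this step, the snippet below builds the `POST` request and reads the `id` out of a response body. The base URL, the empty JSON request body, and the assumption that the response is a JSON object with `id` and `wssUrl` at the top level are all placeholders for illustration; the source does not specify them, and authentication is omitted entirely.

```python
import json
import urllib.request

# Hypothetical base URL: the real host is not given in this document.
BASE_URL = "https://api.example.com"


def build_create_interaction_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request that opens an interaction."""
    # Assumption: an empty JSON object stands in for the real request body.
    return urllib.request.Request(
        url=f"{base_url}/interactions",
        data=json.dumps({}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def interaction_id_from_response(body: bytes) -> str:
    """Return the interaction id from a JSON response body.

    The wssUrl field is deliberately ignored: this workflow does not use it.
    """
    payload = json.loads(body)
    return payload["id"]
```

Keeping request construction separate from sending makes it easy to inspect the request before pointing it at a real server with `urllib.request.urlopen`.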
2
Upload audio recording
- Once the interaction is initialized, the client uploads an audio file associated with that interaction by sending a `POST` request to `/interactions/:id/recording`.
- The API responds with a `200` status and returns a `recordingId`, confirming that the audio file has been successfully uploaded and linked to the interaction.
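The upload request might be assembled as in the sketch below. It assumes the audio is sent as the raw request body with an `audio/wav` content type; the actual encoding (raw bytes versus multipart form data) and accepted formats are not stated here, so treat `content_type` and the base URL argument as hypothetical.

```python
import urllib.parse
import urllib.request


def build_upload_request(
    base_url: str,
    interaction_id: str,
    audio: bytes,
    content_type: str = "audio/wav",  # assumption: not specified in the source
) -> urllib.request.Request:
    """Build (but do not send) the POST request that uploads a recording."""
    if not audio:
        raise ValueError("audio payload is empty")
    # Percent-encode the id so it is always a single, safe path segment.
    safe_id = urllib.parse.quote(interaction_id, safe="")
    return urllib.request.Request(
        url=f"{base_url}/interactions/{safe_id}/recording",
        data=audio,
        headers={"Content-Type": content_type},
        method="POST",
    )
```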
3
Create transcript
- After the recording is uploaded, the client initiates the transcription process by sending a `POST` request to `/interactions/:id/transcripts`.
- The API processes the audio and returns a `200` status with the generated transcript. This transcript contains the text version of the recorded interaction, extracted and formatted for review.
See details on transcription configuration options here
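The three steps can be chained as in the following sketch. To stay independent of any particular HTTP client, the transport is injected as a hypothetical `send(method, path, body)` callable that returns a status code and a parsed JSON body; that callable, the function names, and the response shapes beyond `id` and `recordingId` are assumptions made for illustration. The transcript request is sent without a body here, although the real endpoint may accept configuration options.

```python
from typing import Any, Callable, Dict, Optional, Tuple

# Hypothetical transport: send(method, path, body) -> (status_code, parsed_json)
Send = Callable[[str, str, Optional[bytes]], Tuple[int, Dict[str, Any]]]


class WorkflowError(RuntimeError):
    """Raised when a step returns an unexpected status or response."""


def transcribe_recording(send: Send, audio: bytes) -> Dict[str, Any]:
    """Run create interaction -> upload recording -> create transcript."""
    # Step 1: the source does not name the success status, so accept any 2xx.
    status, interaction = send("POST", "/interactions", None)
    if not 200 <= status < 300:
        raise WorkflowError(f"create interaction failed with status {status}")
    interaction_id = interaction["id"]  # wssUrl is not needed in this workflow

    # Step 2: a 200 status with a recordingId confirms the upload.
    status, recording = send(
        "POST", f"/interactions/{interaction_id}/recording", audio
    )
    if status != 200 or "recordingId" not in recording:
        raise WorkflowError(f"upload recording failed with status {status}")

    # Step 3: a 200 status carries the generated transcript.
    status, transcript = send(
        "POST", f"/interactions/{interaction_id}/transcripts", None
    )
    if status != 200:
        raise WorkflowError(f"create transcript failed with status {status}")
    return transcript
```

Injecting the transport keeps the ordering logic testable with a stub, and lets a real implementation add authentication, retries, or timeouts in one place.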