Speech

Transcribe and align speech from an audio file, converting spoken content into text. This model is optimized for spoken-word transcription. For singing or lyrics, use the Lyrics Transcription module instead.

Documentation

Settings

  • Name
    language
    Type
    string
    Description

    Expected spoken language of the audio. Defaults to auto-detection.

  • Name
    alignment
    Type
    string
    Description

    The type of alignment to apply to the transcription.

  • Name
    diarization
    Type
    boolean
    Description

    Identify the speaker of each subtitle sentence in the input audio.

Input

  • Name
    inputFileUrl
    Type
    string
    Description

    URL of the audio file to process.

Output

  • Name
    alignment
    Type
    string
    Description

    Transcribed and aligned speech.
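
The fields above can be collected into a single job description before submission. The following is a minimal sketch only: the helper name `build_speech_job` and the `input` / `settings` nesting are assumptions made for illustration, not a documented request format. Only the field names and types (`inputFileUrl`, `language`, `alignment`, `diarization`) come from this page. The valid values for `language` and `alignment` are not listed here, so the sketch checks types only.

```python
import json
from typing import Any, Dict, Optional


def build_speech_job(
    input_file_url: str,
    language: Optional[str] = None,
    alignment: Optional[str] = None,
    diarization: Optional[bool] = None,
) -> Dict[str, Any]:
    """Assemble a job description from the documented field names.

    Hypothetical helper: the "input"/"settings" layout is an assumption.
    Settings left as None are omitted so the service defaults apply
    (for example, language auto-detection).
    """
    if not isinstance(input_file_url, str) or not input_file_url:
        raise ValueError("inputFileUrl must be a non-empty string")

    settings: Dict[str, Any] = {}

    # language and alignment are documented as strings.
    for name, value in (("language", language), ("alignment", alignment)):
        if value is None:
            continue
        if not isinstance(value, str):
            raise TypeError(f"{name} must be a string")
        settings[name] = value

    # diarization is documented as a boolean.
    if diarization is not None:
        if not isinstance(diarization, bool):
            raise TypeError("diarization must be a boolean")
        settings["diarization"] = diarization

    return {"input": {"inputFileUrl": input_file_url}, "settings": settings}


if __name__ == "__main__":
    # Placeholder URL; language is omitted to rely on auto-detection.
    job = build_speech_job("https://example.com/interview.wav", diarization=True)
    print(json.dumps(job, indent=2))
```

Omitting unset settings rather than sending empty values keeps the payload small and leaves the defaults to the service. How the job is actually submitted (endpoint, authentication, response shape) is not covered on this page and is deliberately left out of the sketch.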
