Musical stems
Isolate vocals, bass, drums, guitars, strings, piano, keys, and wind from any audio file using AI algorithms.
Your toolbox for creative control
Our Developer Platform lays out Modules as fundamental building blocks, and each component can be used to build your application. Sorted into categories based on their functionalities, the Modules condense advanced technology into units of musical potency.
The individual music source separation components that can be used to build your application.
Isolate vocals, bass, drums, guitars, strings, piano, keys, and wind from any audio file using AI algorithms.
Isolate vocals, effects, and music stems from any audio file, tailored for film and video industry post-production.
Isolate lead and backing vocals from any audio file with high precision.
Isolate rhythm and solo guitar parts from any audio file.
Isolate individual drum sounds including kick drums, snares, toms, hi-hats, and cymbals from any drum audio file.
Isolate the instrumental stem from any audio file.
Separate acoustic and electric guitars from any audio file with the remaining audio elements contained in a separate output.
The individual AI transcription components that can be used to build your application.
Transcribe beats and calculate the tempo (beats per minute) from an input audio file, providing you with useful information for reproducing the rhythm of the track.
Transcribe chords and root key from audio, providing a timeline of chord annotations in different classes (e.g., Complex Jazz, Simple Jazz, Complex Pop and Simple Pop) and bass detection.
Segment audio files into sections with annotated labels.
Transcribe and align Lyrics from any audio file, converting sung content into textual form. This model is optimized for transcribing singing. For speech transcription, please use the Speech Transcription module.
Transcribe and align Speech from any audio file, converting spoken content into textual form. This model is optimized for speech transcription. For singing/lyrics transcription, please use the Lyrics Transcription module.
Translate transcriptions (speech or lyrics) to the desired language.
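The beat transcription module above reports a tempo alongside the beats. As a rough illustration of how a tempo in beats per minute relates to a beat map, here is a minimal sketch of the arithmetic only. It is not the platform's algorithm or API: the function name `estimate_bpm` and the input format (a list of beat timestamps in seconds) are assumptions made for this example.

```python
from statistics import median


def estimate_bpm(beat_times):
    """Estimate tempo (beats per minute) from beat timestamps given in seconds.

    Hypothetical helper for illustration; not part of any platform API.
    """
    if len(beat_times) < 2:
        raise ValueError("need at least two beats to estimate a tempo")
    # Inter-beat intervals: the time between each pair of consecutive beats.
    intervals = [later - earlier for earlier, later in zip(beat_times, beat_times[1:])]
    # The median is less sensitive to one missed or extra beat than the mean.
    return 60.0 / median(intervals)


beats = [0.0, 0.5, 1.0, 1.5, 2.0]  # hypothetical beat map: one beat every 0.5 s
print(estimate_bpm(beats))  # 120.0
```

Using the median interval is a design choice for robustness; a real beat tracker would typically also handle tempo changes within a track.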
The individual audio mixing component that can be used to build your application.
Mix a collection of audio channels, customizing individual channel volumes and phase inversion.
Generate videos mixed with audio and subtitles.
AI-powered multitrack mixing: Work with up to 8 channels and achieve a professional, balanced mix.
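To make the idea of per-channel volume and phase inversion in the mixing module concrete, the sketch below sums several channels into one. It assumes each channel is an equal-length list of float samples; the function `mix_channels` and its parameters are hypothetical and only illustrate the underlying operation, not how the module is implemented or called.

```python
def mix_channels(channels, gains, invert):
    """Sum equal-length channels, applying a gain and optional polarity flip to each.

    channels: list of sample lists; gains: linear gain per channel;
    invert: True to flip a channel's polarity (phase inversion).
    Hypothetical helper for illustration only.
    """
    if not channels:
        raise ValueError("at least one channel is required")
    if not (len(channels) == len(gains) == len(invert)):
        raise ValueError("channels, gains and invert must have the same length")
    length = len(channels[0])
    if any(len(samples) != length for samples in channels):
        raise ValueError("all channels must have the same number of samples")

    mixed = [0.0] * length
    for samples, gain, flip in zip(channels, gains, invert):
        sign = -1.0 if flip else 1.0  # phase inversion multiplies every sample by -1
        for i, sample in enumerate(samples):
            mixed[i] += sign * gain * sample
    return mixed


tone = [0.5, -0.25, 1.0]
# Mixing a signal with an inverted copy of itself at equal gain cancels it out.
print(mix_channels([tone, tone], gains=[1.0, 1.0], invert=[False, True]))  # [0.0, 0.0, 0.0]
```

The cancellation in the usage line is the classic reason phase inversion matters when combining microphones or stems that captured the same source.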
The individual audio mastering component that can be used to build your application.
Master the audio using a reference master, with customizable bit depth, normalization, and limiting.
Generate a mastered version of input audio using advanced AI algorithms.
Spatial mastering of stereo references and stems, powered by Masterchannel.
The individual metronome generation component that can be used to build your application.
Generate metronome audio based on the input beat map, allowing customization of speed and choice of metronome sound.
Convert text to speech with the option to translate to another language. It includes two demo voices (Male and Female). To train your own voice model, please contact sales.
The individual encoding components that can be used to build your application.
Convert and encode the input audio file to a different format, bit rate, sample rate, and number of channels, supporting various common audio codecs and formats.
Convert JSON to CSV format.
Convert and encode the input subtitle file to a different format, supporting various common subtitle formats.
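For the JSON to CSV conversion listed above, the following sketch shows one common way such a conversion works, using only the Python standard library. It assumes the JSON is a flat array of objects; the module's actual input schema is not described here, and the function name `json_to_csv` is made up for this example.

```python
import csv
import io
import json


def json_to_csv(json_text):
    """Convert a JSON array of flat objects into CSV text.

    Hypothetical helper; assumes a list of dictionaries with scalar values.
    """
    records = json.loads(json_text)
    if not isinstance(records, list) or not all(isinstance(r, dict) for r in records):
        raise ValueError("expected a JSON array of objects")

    # Collect column names in first-seen order so the header is stable.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)  # keys missing from a record become empty cells
    return buffer.getvalue()


sections = '[{"start": 0.0, "end": 1.5, "label": "intro"}, {"start": 1.5, "end": 4.0, "label": "verse"}]'
print(json_to_csv(sections))
```

Nested JSON would need to be flattened first; how that is handled is left open in this sketch.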
The individual effects components that can be used to build your application.
Apply a dynamic range limiter to control and limit audio signal loudness (iLUFS) while preserving dynamic range.
Apply a limiter to selectively control signal peaks while preserving sound quality.
Normalize audio adjusting peak volume levels to the specified decibel (dB) value, while maintaining the dynamics and balance of the original audio.
Apply the Overdrive effect to amplify audio, adding a rich, powerful, and slightly distorted tone.
Adjust the pitch of audio, shifting up or down by a specified number of cents.
Add the reverb effect to an audio file, simulating the sound reflections and reverberations within a room, hall or space while allowing customization of various parameters.
Reverse the audio file so that it plays backward from the end to the beginning without affecting other audio properties.
Adjust the playback speed of the audio, using a customizable factor ranging from 0 to 10. This will modify both the audio's tempo and pitch. If you prefer to maintain the original pitch, consider using the Tempo effect instead.
Adjust the tempo of the audio file without affecting pitch, using a customizable factor ranging from 0 to 10.
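Several of the effects above are driven by a single number: a target peak in dB, a pitch shift in cents, or a speed factor. The sketch below shows the standard arithmetic behind those parameters. It is an illustration under stated assumptions, not the modules' implementation: samples are assumed to be floats in the range -1.0 to 1.0 with the dB target measured relative to full scale, and all function names are hypothetical.

```python
import math


def peak_normalization_gain(samples, target_db):
    """Linear gain that brings the loudest sample to target_db (assumed dB full scale)."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        raise ValueError("cannot normalize a silent signal")
    target_peak = 10 ** (target_db / 20)  # convert decibels to linear amplitude
    return target_peak / peak


def cents_to_ratio(cents):
    """Frequency ratio for a pitch shift in cents (1200 cents = one octave)."""
    return 2 ** (cents / 1200)


def speed_change(duration_s, factor):
    """New duration and resulting pitch offset (in cents) when playback speed changes.

    A plain speed change resamples the audio, so tempo and pitch move together.
    """
    if factor <= 0:
        raise ValueError("speed factor must be positive in this sketch")
    return duration_s / factor, 1200 * math.log2(factor)


print(peak_normalization_gain([0.5, -0.25], target_db=0.0))  # 2.0
print(cents_to_ratio(1200))                                   # 2.0
print(speed_change(10.0, 2.0))                                # (5.0, 1200.0)
```

The last line shows why the Speed and Tempo effects are listed separately: doubling the speed halves the duration and also raises the pitch by an octave, whereas a tempo change keeps the pitch offset at zero.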
The individual music utilities components that can be used to build your application.
A static input that will use a single file for processing. A helpful ingredient when reference mastering.
Extract a specified segment from an audio file, retaining the chosen duration starting from a designated point in time.
Extract segments from an audio file based on a segment array with 'start' and 'end' properties.
Compute pitch shift suggestions based on vocal range map and input audio target.
Match audio containing speech or singing with corresponding subtitle lines or words, providing word-by-word and line-by-line aligned data in JSON format.
Generate a timeline marking periods of audio activity.
Add silence to the beginning and/or end of your audio file to create space before and after the audio content.
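The segment extraction utility above takes an array of segments with 'start' and 'end' properties. As a sketch of what that operation amounts to on raw samples, the example below slices a sample list once per segment. The units are an assumption (seconds here, since the listing does not state them), and `extract_segments` is a hypothetical name, not the module's interface.

```python
def extract_segments(samples, sample_rate, segments):
    """Return one list of samples per segment.

    segments: list of dicts with 'start' and 'end', assumed to be in seconds.
    Hypothetical helper for illustration only.
    """
    clips = []
    for segment in segments:
        start, end = segment["start"], segment["end"]
        if start < 0 or end < start:
            raise ValueError(f"invalid segment: {segment}")
        first = round(start * sample_rate)                     # seconds -> sample index
        last = min(round(end * sample_rate), len(samples))     # clamp to the file length
        clips.append(samples[first:last])
    return clips


audio = list(range(10))  # stand-in for 10 samples at 2 samples per second
print(extract_segments(audio, 2, [{"start": 1, "end": 3}, {"start": 4, "end": 10}]))
# [[2, 3, 4, 5], [8, 9]]
```

Clamping the end index means a segment that runs past the end of the file yields a shorter clip rather than an error; a stricter variant could reject it instead.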
The individual classification components that can be used to build your application.
Compute and return vocal pitch from audio(s), providing a pitch map for each MIDI key from 0 to 127.
Classify audio segments for simplified metadata management and content discovery.
Classify AI-generated audio content, particularly focusing on AI-Generated vocals.
Extract media file metadata including duration, sample rate, channel count, bit depth, and codec.
Provide deep insights into music tracks for advanced categorization and analysis, powered by Cyanite.ai.
Detect music presence in audio files.
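The vocal pitch module above describes its output in terms of MIDI keys 0 to 127. The standard mapping between frequency and MIDI key (A4 = 440 Hz = key 69, 12 keys per octave) can be sketched as follows. The histogram is one plausible reading of a per-key pitch map, assuming one pitch estimate in Hz per analysis frame; the module's real output layout is not described here, and both function names are hypothetical.

```python
import math


def frequency_to_midi_key(freq_hz):
    """Nearest MIDI key for a frequency, or None if it falls outside 0..127."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    key = round(69 + 12 * math.log2(freq_hz / 440.0))  # 69 is A4 at 440 Hz
    return key if 0 <= key <= 127 else None


def build_pitch_histogram(frame_frequencies):
    """Count how many frames land on each MIDI key (list of 128 counts)."""
    counts = [0] * 128
    for freq in frame_frequencies:
        key = frequency_to_midi_key(freq)
        if key is not None:
            counts[key] += 1
    return counts


print(frequency_to_midi_key(440.0))   # 69
print(frequency_to_midi_key(261.63))  # 60 (middle C)
histogram = build_pitch_histogram([440.0, 440.0, 880.0])
print(histogram[69], histogram[81])   # 2 1
```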
The individual enhancement components that can be used to build your application.
This module processes voice recordings of speech or singing and upsamples them to 48 kHz.
Remove background noise from voice recordings. It is useful for cleaning up recordings made in noisy environments, such as a busy street or a crowded restaurant.
The individual style transfer components that can be used to build your application.
Provide comprehensive metrics for audio source separation quality including SDR, SI-SDR, SNR, and SI-SNR.
Transfer the timbre from one voice to another. Currently supporting two demo voices. To train your own voice transfer model (Private model ID), please contact sales.
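The separation quality metrics listed above (SDR, SI-SDR, SNR, SI-SNR) are all energy ratios expressed in decibels. The sketch below implements the simple textbook forms of SDR and SI-SDR for a reference stem and an estimated stem. These definitions are an assumption for illustration: the module may use a different variant (for example, a toolkit-specific SDR), and the function names are hypothetical.

```python
import math


def _energy(signal):
    return sum(v * v for v in signal)


def sdr_db(reference, estimate):
    """Signal-to-distortion ratio: reference energy over error energy, in dB."""
    error = [e - r for r, e in zip(reference, estimate)]
    error_energy = _energy(error)
    if error_energy == 0:
        return math.inf  # a perfect estimate has no distortion
    return 10 * math.log10(_energy(reference) / error_energy)


def si_sdr_db(reference, estimate):
    """Scale-invariant SDR: rescale the reference to best match the estimate first."""
    ref_energy = _energy(reference)
    if ref_energy == 0:
        raise ValueError("reference signal is silent")
    scale = sum(e * r for r, e in zip(reference, estimate)) / ref_energy
    target = [scale * r for r in reference]          # projection of estimate onto reference
    noise = [e - t for e, t in zip(estimate, target)]
    noise_energy = _energy(noise)
    if noise_energy == 0:
        return math.inf
    return 10 * math.log10(_energy(target) / noise_energy)


reference = [1.0, 0.0, -1.0, 0.0]
quieter = [0.5, 0.0, -0.5, 0.0]  # same stem at half the volume
print(round(sdr_db(reference, quieter), 2))  # 6.02
print(si_sdr_db(reference, quieter))         # inf
```

The usage lines show the practical difference: plain SDR penalizes an otherwise perfect stem for being quieter than the reference, while the scale-invariant form ignores the level mismatch.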
The individual input/output components that can be used to build your application.
Start now — or reach out for assistance.