
AsrInput

Properties

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| content_uri | str | | Publicly accessible URI. |
| encoding | AsrInputEncoding | | The encoding of the original audio. |
| language_code | str | | Language spoken in the audio file. |
| source | str | | Source of the audio file, e.g. Phone, RingCentral, GoogleMeet, Zoom. |
| audio_type | AsrInputAudioType | | Type of the audio. |
| separate_speaker_per_channel | bool | | Indicates that the input audio is multi-channel and each channel has a separate speaker. |
| speaker_count | int | | Number of speakers in the file. Omit this parameter if unknown. |
| speaker_ids | List[str] | | Optional set of speakers to be identified from the call. |
| enable_voice_activity_detection | bool | | Applies voice activity detection. |
| enable_punctuation | bool | | Enables the Smart Punctuation API. |
| enable_speaker_diarization | bool | | Tags each word with its corresponding speaker. |
| speech_contexts | List[SpeechContextPhrasesInput] | | Words or phrases used to boost the transcript. This can improve accuracy for terms such as person names and company names. |
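
As a rough illustration of how these properties fit together, the sketch below assembles an AsrInput-shaped dictionary. It does not use the SDK itself: `build_asr_input` and `ASR_INPUT_FIELDS` are hypothetical names, the URI is a placeholder, and the `"en-US"` language code format is an assumption. The field names, and the `"Wav"` and `"CallCenter"` values, come from the tables on this page.

```python
from typing import Any, Dict

# Field names copied from the AsrInput properties table.
ASR_INPUT_FIELDS = {
    "content_uri", "encoding", "language_code", "source", "audio_type",
    "separate_speaker_per_channel", "speaker_count", "speaker_ids",
    "enable_voice_activity_detection", "enable_punctuation",
    "enable_speaker_diarization", "speech_contexts",
}


def build_asr_input(**fields: Any) -> Dict[str, Any]:
    """Hypothetical helper: validate field names and drop unset (None) values."""
    unknown = set(fields) - ASR_INPUT_FIELDS
    if unknown:
        raise TypeError(f"unknown AsrInput field(s): {sorted(unknown)}")
    # Omitting None mirrors the guidance to leave out speaker_count when unknown.
    return {name: value for name, value in fields.items() if value is not None}


payload = build_asr_input(
    content_uri="https://example.com/audio/call.wav",  # placeholder URI
    encoding="Wav",
    language_code="en-US",  # assumed format; not specified on this page
    source="Phone",
    audio_type="CallCenter",
    speaker_count=None,  # unknown, so it is omitted from the payload
    enable_punctuation=True,
    enable_speaker_diarization=True,
)
print(payload)
```

Dropping `None` values rather than sending them keeps the payload aligned with the note that `speaker_count` should be omitted when the number of speakers is unknown.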

AsrInputEncoding

The encoding of the original audio

Properties

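| Name | Type | Required | Description |
| --- | --- | --- | --- |
| MPEG | str | | "Mpeg" |
| MP4 | str | | "Mp4" |
| WAV | str | | "Wav" |
| WEBM | str | | "Webm" |
| WEBP | str | | "Webp" |
| AAC | str | | "Aac" |
| AVI | str | | "Avi" |
| OGG | str | | "Ogg" |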
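
The encoding values above can be mirrored as a Python enum. The class below is a local stand-in, not the SDK's own class, and `encoding_from_uri` is a hypothetical helper. It assumes the file extension matches a member name exactly (for example `.wav` for `WAV`), which will not hold for every file, and it does not handle URIs that carry query strings.

```python
from enum import Enum
from pathlib import PurePosixPath
from typing import Optional


class AsrInputEncoding(str, Enum):
    """Local stand-in mirroring the values listed in the table."""
    MPEG = "Mpeg"
    MP4 = "Mp4"
    WAV = "Wav"
    WEBM = "Webm"
    WEBP = "Webp"
    AAC = "Aac"
    AVI = "Avi"
    OGG = "Ogg"


def encoding_from_uri(uri: str) -> Optional[AsrInputEncoding]:
    """Guess the encoding from the extension; return None if nothing matches."""
    extension = PurePosixPath(uri).suffix.lstrip(".").upper()
    return AsrInputEncoding.__members__.get(extension)


guess = encoding_from_uri("https://example.com/audio/call.wav")
print(guess.value if guess else "unknown")
```

Returning `None` for an unrecognized extension leaves the decision to the caller instead of silently picking a default encoding.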

AsrInputAudioType

Type of the audio

Properties

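| Name | Type | Required | Description |
| --- | --- | --- | --- |
| CALLCENTER | str | | "CallCenter" |
| MEETING | str | | "Meeting" |
| EARNINGSCALLS | str | | "EarningsCalls" |
| INTERVIEW | str | | "Interview" |
| PRESSCONFERENCE | str | | "PressConference" |
| VOICEMAIL | str | | "Voicemail" |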
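
When the audio type arrives as free text (for instance from a config file), it can help to validate it before building a request. The sketch below uses a local stand-in enum rather than the SDK's own class, and `parse_audio_type` is a hypothetical helper. The lookup is case-sensitive because Python enum lookup by value is; whether the API itself is case-sensitive is not stated on this page.

```python
from enum import Enum


class AsrInputAudioType(str, Enum):
    """Local stand-in mirroring the values listed in the table."""
    CALLCENTER = "CallCenter"
    MEETING = "Meeting"
    EARNINGSCALLS = "EarningsCalls"
    INTERVIEW = "Interview"
    PRESSCONFERENCE = "PressConference"
    VOICEMAIL = "Voicemail"


def parse_audio_type(raw: str) -> AsrInputAudioType:
    """Convert a raw string to an audio type, listing valid values on failure."""
    try:
        return AsrInputAudioType(raw)
    except ValueError:
        allowed = ", ".join(member.value for member in AsrInputAudioType)
        raise ValueError(
            f"unsupported audio_type {raw!r}; expected one of: {allowed}"
        ) from None


print(parse_audio_type("Meeting").value)
```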
