AsrInput

Properties

Name	Type	Required	Description
content_uri	str	❌	Publicly accessible URI of the audio file.
encoding	AsrInputEncoding	❌	The encoding of the original audio
language_code	str	❌	Language spoken in the audio file.
source	str	❌	Source of the audio file, e.g. Phone, RingCentral, GoogleMeet, Zoom.
audio_type	AsrInputAudioType	❌	Type of the audio
separate_speaker_per_channel	bool	❌	Indicates that the input audio is multi-channel and each channel has a separate speaker.
speaker_count	int	❌	Number of speakers in the file. Omit this parameter if unknown.
speaker_ids	List[str]	❌	Optional set of speakers to be identified from the call.
enable_voice_activity_detection	bool	❌	Apply voice activity detection.
enable_punctuation	bool	❌	Enables Smart Punctuation API.
enable_speaker_diarization	bool	❌	Tags each word with its corresponding speaker.
speech_contexts	List[SpeechContextPhrasesInput]	❌	Words or phrases used to boost transcription accuracy, for example person names or company names.
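
The properties above can be illustrated with a minimal sketch. The SDK's import path is not shown on this page, so `AsrInputSketch` below is a hypothetical stand-in dataclass that only mirrors the documented field names; the URI is a placeholder and the `"en-US"` language code format is an assumption. `speech_contexts` is left out because `SpeechContextPhrasesInput` is not described here.

```python
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class AsrInputSketch:
    """Hypothetical stand-in mirroring the documented AsrInput fields."""

    content_uri: Optional[str] = None
    encoding: Optional[str] = None  # an AsrInputEncoding value, e.g. "Wav"
    language_code: Optional[str] = None
    source: Optional[str] = None
    audio_type: Optional[str] = None  # an AsrInputAudioType value, e.g. "CallCenter"
    separate_speaker_per_channel: Optional[bool] = None
    speaker_count: Optional[int] = None
    speaker_ids: Optional[List[str]] = None
    enable_voice_activity_detection: Optional[bool] = None
    enable_punctuation: Optional[bool] = None
    enable_speaker_diarization: Optional[bool] = None

    def to_payload(self) -> dict:
        # Drop unset fields so optional parameters are omitted, not sent as null.
        return {k: v for k, v in asdict(self).items() if v is not None}


asr_input = AsrInputSketch(
    content_uri="https://example.com/audio/call.wav",  # placeholder URI
    encoding="Wav",
    language_code="en-US",  # assumed format; not confirmed by this page
    source="Phone",
    audio_type="CallCenter",
    enable_punctuation=True,
    enable_speaker_diarization=True,
)
payload = asr_input.to_payload()
```

Because every property is optional, `speaker_count` is simply not set here, which matches the guidance to omit it when the number of speakers is unknown.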

AsrInputEncoding

The encoding of the original audio

Properties

Name	Type	Required	Description
MPEG	str	✅	"Mpeg"
MP4	str	✅	"Mp4"
WAV	str	✅	"Wav"
WEBM	str	✅	"Webm"
WEBP	str	✅	"Webp"
AAC	str	✅	"Aac"
AVI	str	✅	"Avi"
OGG	str	✅	"Ogg"
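
One way to work with these values is a string-backed enum. The sketch below is a hypothetical mirror of the table above (`AsrInputEncodingSketch` and `parse_encoding` are illustrative names, not SDK identifiers) and shows how a raw string could be checked against the documented values before building a request.

```python
from enum import Enum


class AsrInputEncodingSketch(str, Enum):
    """Hypothetical mirror of the documented AsrInputEncoding values."""

    MPEG = "Mpeg"
    MP4 = "Mp4"
    WAV = "Wav"
    WEBM = "Webm"
    WEBP = "Webp"
    AAC = "Aac"
    AVI = "Avi"
    OGG = "Ogg"


def parse_encoding(raw: str) -> AsrInputEncodingSketch:
    """Return the enum member for a documented value, or raise ValueError."""
    try:
        return AsrInputEncodingSketch(raw)
    except ValueError:
        allowed = ", ".join(member.value for member in AsrInputEncodingSketch)
        raise ValueError(
            f"Unsupported encoding {raw!r}; expected one of: {allowed}"
        ) from None


encoding = parse_encoding("Wav")
```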

AsrInputAudioType

Type of the audio

Properties

Name	Type	Required	Description
CALLCENTER	str	✅	"CallCenter"
MEETING	str	✅	"Meeting"
EARNINGSCALLS	str	✅	"EarningsCalls"
INTERVIEW	str	✅	"Interview"
PRESSCONFERENCE	str	✅	"PressConference"
VOICEMAIL	str	✅	"Voicemail"
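
When the audio type comes from user input, it may help to normalize it to the documented casing first. The sketch below uses `typing.Literal` for this; `AudioType` and `normalize_audio_type` are hypothetical helper names, and the case-insensitive matching is a client-side convenience, since this page does not say whether the API itself is case-sensitive.

```python
from typing import Literal, get_args

# Values copied from the table above.
AudioType = Literal[
    "CallCenter",
    "Meeting",
    "EarningsCalls",
    "Interview",
    "PressConference",
    "Voicemail",
]
ALLOWED_AUDIO_TYPES = get_args(AudioType)


def normalize_audio_type(raw: str) -> str:
    """Match case-insensitively and return the documented spelling."""
    lookup = {value.lower(): value for value in ALLOWED_AUDIO_TYPES}
    try:
        return lookup[raw.strip().lower()]
    except KeyError:
        raise ValueError(
            f"Unknown audio type {raw!r}; expected one of: "
            + ", ".join(ALLOWED_AUDIO_TYPES)
        ) from None


audio_type = normalize_audio_type(" earningscalls ")
```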
