# Audio - Append to Existing Conversation

`PUT {{base_url}}/v1/process/audio/{{conversation_id}}`

## Append Audio File

The Async Audio API allows you to process an additional audio file against a previous conversation, append its transcription, and get conversational insights for the updated conversation.

This is useful in any scenario where you have multiple audio files for a conversation and want to extract the insights supported by the Conversation API from all of them.

The conversationId of a conversation processed through any channel (Realtime, Audio/Video Files, or Text Content) is accepted.

Learn more about the Async Audio API.

## Request Body

The binary payload of the audio file.

Note that the content type is binary, which lets you upload the audio file directly as the request body.
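
For illustration, here is a minimal sketch of the append call using Python's requests library. The base URL, token, file name, and content type are placeholder assumptions to substitute with your own values.

```python
import requests

# Placeholder values -- substitute your own.
BASE_URL = "https://api.symbl.ai"        # assumed base_url
ACCESS_TOKEN = "<your_access_token>"     # token sent in the x-api-key header
CONVERSATION_ID = "<conversation_id>"    # returned by the original POST request

url = f"{BASE_URL}/v1/process/audio/{CONVERSATION_ID}"
headers = {
    "x-api-key": ACCESS_TOKEN,
    "Content-Type": "audio/wav",  # match the format of your audio file
}

# Send the raw binary payload of the audio file as the request body.
with open("followup-call.wav", "rb") as audio_file:
    response = requests.put(url, headers=headers, data=audio_file)

print(response.json())  # expected to contain conversationId and jobId
```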

## Path Params

| Parameter | Value |
|---|---|
| conversationId | The conversationId returned by the first request submitted to the POST Async Audio API. |

## Query Params

| Parameter | Required | Description |
|---|---|---|
| name | No | Your meeting name. Defaults to the conversationId. |
| webhookUrl | No | Webhook URL to which job status updates are sent. This should be a POST endpoint. |
| customVocabulary | No | A list of words and phrases that provide hints to the speech recognition task. |
| entities | No | Custom entities to be detected in your conversation using the Entities API. See the example after this table. |
| detectPhrases | No | Accepted values are true and false. Shows Actionable Phrases in each sentence of the conversation; these sentences can be found in the Conversation's Messages API. |
| enableSeparateRecognitionPerChannel | No | Enables speaker-separated channel audio processing. Accepts true or false. |
| channelMetadata | No | An object containing two fields, speaker and channel, that specify which speaker corresponds to which channel. It only takes effect when enableSeparateRecognitionPerChannel is set to true. See the channelMetadata Object section below. |
| languageCode | No | We accept different languages; please check the supported language codes as per your requirement. |
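
As a sketch of how these options might be attached, the snippet below adds query parameters to the same PUT request. It assumes object-valued parameters such as entities and channelMetadata are passed as JSON-encoded strings in the query; the parameter values themselves are illustrative.

```python
import json
import requests

url = "https://api.symbl.ai/v1/process/audio/<conversation_id>"  # placeholder
headers = {"x-api-key": "<your_access_token>", "Content-Type": "audio/wav"}

params = {
    "name": "Business Meeting - Part 2",          # illustrative meeting name
    "webhookUrl": "https://example.com/webhook",  # your POST endpoint
    "detectPhrases": "true",
    "enableSeparateRecognitionPerChannel": "true",
    # Assumption: object-valued params are JSON-encoded into the query string.
    "entities": json.dumps([
        {"customType": "Company Executives",
         "text": "Marketing director, sales director"}
    ]),
    "channelMetadata": json.dumps([
        {"channel": 1, "speaker": {"name": "Robert Bartheon",
                                   "email": "robertbartheon@gmail.com"}},
        {"channel": 2, "speaker": {"name": "Arya Stark",
                                   "email": "aryastark@gmail.com"}},
    ]),
}

with open("followup-call.wav", "rb") as audio_file:
    response = requests.put(url, headers=headers, params=params, data=audio_file)
```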

## Response

In response, conversationId and jobId are returned.

jobId can be used to get updates on the job status.

conversationId can be used with the Conversation API to retrieve all the insights, topics, processed messages, and more.
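
A successful response has roughly this shape (the IDs below are placeholders, not real values):

```json
{
  "conversationId": "5815170693595136",
  "jobId": "9b1deb4d-3b7d-4bad-9bdd-2b0d7b3dcb6d"
}
```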

## Webhook Payload

The webhookUrl is used to send the status of the job created for the uploaded audio. Each time the job status changes, a notification is sent to the webhookUrl.

| Parameter | Value |
|---|---|
| jobId | ID to be used with the Job API. |
| status | Current status of the job. Valid statuses: scheduled, in_progress, completed, failed. |
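
Each notification sent to the webhookUrl would then carry a payload along these lines (values are placeholders):

```json
{
  "jobId": "9b1deb4d-3b7d-4bad-9bdd-2b0d7b3dcb6d",
  "status": "in_progress"
}
```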

## channelMetadata Object

```json
{
  "channelMetadata": [
    {
      "channel": 1,
      "speaker": {
        "name": "Robert Bartheon",
        "email": "robertbartheon@gmail.com"
      }
    },
    {
      "channel": 2,
      "speaker": {
        "name": "Arya Stark",
        "email": "aryastark@gmail.com"
      }
    }
  ]
}
```

The channelMetadata object has the following members:

| Field | Description |
|---|---|
| channel | This denotes the channel number in the audio file. Each channel contains an independent speaker's voice data. |
| speaker | This is the wrapper object that defines the speaker for this channel. |

The speaker object has the following members:

| Field | Description |
|---|---|
| name | Name of the speaker. |
| email | Email address of the speaker. |

Billing for a speaker-separated channel audio file is based on the number of channels present in the audio file. The duration for billing is calculated according to the formula below:

`totalDuration = duration_of_the_audio_file * total_number_of_channels`

So if you send a 120-second file with 3 speaker separated channels, the total duration for billing would be 360 seconds or 6 minutes.
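
As a quick sanity check of the formula:

```python
duration_of_the_audio_file = 120   # seconds
total_number_of_channels = 3
totalDuration = duration_of_the_audio_file * total_number_of_channels
print(totalDuration)  # 360 seconds, i.e. 6 minutes of billed duration
```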

## Request Params

| Key | Datatype | Required | Description |
|---|---|---|---|
| name | string | No | Your meeting name. Defaults to the conversationId. |
| customVocabulary | string | No | A list of words and phrases that provide hints to the speech recognition task. |
| confidenceThreshold | string | No | Minimum confidence required for an insight to be recognized. The range is 0.0 to 1.0. Default value is 0.5. |
| detectEntities | boolean | No | If not set to true, the Entities API will not return any entities from the conversation. |
| detectPhrases | boolean | No | Shows Actionable Phrases in each sentence of the conversation. These sentences can be found in the Conversation's Messages API. |
| languageCode | string | No | We accept different languages. They can be found here: https://docs.symbl.ai/docs/async-api/overview/async-api-supported-languages |
| mode | string | No | 'phone' mode is best for audio generated from a phone call (typically recorded at an 8 kHz sampling rate). 'default' mode works best for audio generated from video or online meetings (typically recorded at a 16 kHz or higher sampling rate). When you don't pass this parameter, 'default' is selected automatically. |
| trackers | string | No | A list of keywords and/or phrases to be tracked using the Tracker API. |
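
For example, several of these parameters could be combined on a single request and passed the same way as the query params shown earlier. The tracker structure below (a name plus a vocabulary list) is an assumption for illustration; adjust it to whatever the Tracker API expects.

```python
import json

# Illustrative parameter values only.
params = {
    "name": "Support Call - Part 2",
    "confidenceThreshold": "0.6",   # insights below 0.6 confidence are dropped
    "detectEntities": "true",
    "mode": "phone",                # 8 kHz phone-call audio
    # Assumed shape: a tracker name plus the vocabulary to watch for.
    "trackers": json.dumps([
        {"name": "Budget", "vocabulary": ["budget", "cost", "pricing"]}
    ]),
}
```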

## Headers

| Key | Datatype | Required | Description |
|---|---|---|---|
| x-api-key | string | Yes | The access token used to authenticate the request. |