Audio - Append to existing Conversation
PUT {{base_url}}/v1/process/audio/{{conversation_id}}
Append Audio File
The Async Audio API lets you process an additional audio file for an existing conversation, append its transcription, and get conversational insights for the updated conversation.
This is useful whenever you have multiple audio files for the same conversation, of any type, and want to extract the insights supported by the Conversation API.
The conversationId of a conversation processed through any channel (Realtime, Audio/Video Files, or Text Content) is accepted.
Learn More about Async Audio API.
Request Body
The binary payload of an audio file.
Note that the content type is binary, which allows you to select the file you want to upload.
Path Params
Parameter | value |
---|---|
conversationId | The conversationId returned by the first request submitted using the POST Async Audio API. |
Query Params
Parameters | Required | Description |
---|---|---|
name | No | Your meeting name. Defaults to the conversationId. |
webhookUrl | No | Webhook URL to which job updates are sent. This should be a POST endpoint. |
customVocabulary | No | Contains a list of words and phrases that provide hints to the speech recognition task. |
entities | No | Custom entities to detect in your conversation using the Entities API. |
detectPhrases | No | Accepted values are true and false. When enabled, Actionable Phrases are shown in each sentence of the conversation. These sentences can be found in the Conversation's Messages API. |
enableSeparateRecognitionPerChannel | No | Enables Speaker Separated Channel audio processing. Accepts true or false. |
channelMetadata | No | This object parameter contains two variables, speaker and channel, to specify which speaker corresponds to which channel. It only takes effect when the enableSeparateRecognitionPerChannel query param is set to true. |
languageCode | No | Several languages are supported. Pass the language code that matches your audio. |
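The path and query parameters above can be combined into a single PUT request. The following is a minimal sketch, not an official client: the function name `build_append_audio_request`, the example base URL, and the `audio/wav` content type are assumptions made for illustration. It only builds the request object and does not send it.

```python
from urllib.parse import urlencode, quote
from urllib.request import Request


def build_append_audio_request(base_url, conversation_id, audio_bytes, api_key, **query):
    """Build (but do not send) the PUT request that appends audio to a conversation."""
    # Drop unset parameters; booleans become the lowercase strings true/false.
    params = {}
    for key, value in query.items():
        if value is None:
            continue
        if isinstance(value, bool):
            value = "true" if value else "false"
        params[key] = value

    # conversationId is a path parameter of PUT /v1/process/audio/{conversationId}.
    url = f"{base_url.rstrip('/')}/v1/process/audio/{quote(conversation_id, safe='')}"
    if params:
        url += "?" + urlencode(params)

    headers = {
        "x-api-key": api_key,         # header name taken from the HEADERS table
        "Content-Type": "audio/wav",  # assumption: the uploaded file is WAV
    }
    # The request body is the raw binary payload of the audio file.
    return Request(url, data=audio_bytes, headers=headers, method="PUT")
```

To send the request, pass the returned object to `urllib.request.urlopen`; the audio bytes would typically come from `open(path, "rb").read()`.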
Response
The response returns a conversationId and a jobId.
jobId can be used to get updates on the job status.
conversationId can be used with the Conversation API to get all the insights, topics, processed messages, etc.
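A small sketch of reading these two identifiers from the response follows. It assumes the response body is a JSON object with top-level `conversationId` and `jobId` keys; the helper name `parse_append_response` is hypothetical.

```python
import json


def parse_append_response(body):
    """Return (conversation_id, job_id) from a response body.

    Assumes the body is a JSON object containing both keys.
    """
    data = json.loads(body)
    missing = [key for key in ("conversationId", "jobId") if key not in data]
    if missing:
        raise ValueError(f"response is missing fields: {', '.join(missing)}")
    return data["conversationId"], data["jobId"]
```

Failing loudly on a missing field makes it easier to distinguish an error response from a successfully created job.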
Webhook Payload
webhookUrl is used to send the status of the job created for the uploaded audio. Every time the status of the job changes, a notification is sent to the webhookUrl.
Parameter | value |
---|---|
jobId | ID to be used with Job API. |
status | Current status of the job. (Valid statuses - [ scheduled, in_progress, completed, failed ]) |
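A webhook receiver could validate each notification along these lines. This is a sketch under stated assumptions: the payload is taken to be a JSON object with the `jobId` and `status` fields listed above, and treating `completed` and `failed` as final states is an interpretation, not something the table states. The name `handle_job_update` is hypothetical.

```python
import json

# Statuses listed in the webhook payload table.
VALID_STATUSES = {"scheduled", "in_progress", "completed", "failed"}
# Assumption: no further updates are expected after these two.
FINAL_STATUSES = {"completed", "failed"}


def handle_job_update(raw_body):
    """Parse a webhook payload and return (job_id, status, is_final)."""
    payload = json.loads(raw_body)
    job_id = payload["jobId"]
    status = payload["status"]
    if status not in VALID_STATUSES:
        raise ValueError(f"unexpected job status: {status!r}")
    return job_id, status, status in FINAL_STATUSES
```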
## channelMetadata Object
{
  "channelMetadata": [
    {
      "channel": 1,
      "speaker": {
        "name": "Robert Bartheon",
        "email": "robertbartheon@gmail.com"
      }
    },
    {
      "channel": 2,
      "speaker": {
        "name": "Arya Stark",
        "email": "aryastark@gmail.com"
      }
    }
  ]
}
The channelMetadata object has the following members:
Field | Description |
---|---|
channel | The channel number in the audio file. Each channel contains the voice data of one independent speaker. |
speaker | This is the wrapper object which defines the speaker for this channel. |
The speaker object has the following members:
Field | Description |
---|---|
name | Name of the speaker. |
email | Email address of the speaker. |
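The structure described above can be generated from a list of speakers. The sketch below assumes channels are numbered from 1 in the order the speakers are given, as in the sample object; the helper name `build_channel_metadata` is hypothetical, and how the resulting value is attached to the request is not covered here.

```python
import json


def build_channel_metadata(speakers):
    """Build the channelMetadata list from (name, email) pairs.

    Assumes channel numbers start at 1 and follow the order of `speakers`.
    """
    return [
        {"channel": number, "speaker": {"name": name, "email": email}}
        for number, (name, email) in enumerate(speakers, start=1)
    ]


metadata = build_channel_metadata([
    ("Robert Bartheon", "robertbartheon@gmail.com"),
    ("Arya Stark", "aryastark@gmail.com"),
])
# Serialized form matching the sample object shown earlier.
serialized = json.dumps({"channelMetadata": metadata})
```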
Billing for a speaker separated channel audio file is based on the number of channels present in the audio file. The billed duration is calculated using the following formula:
totalDuration = duration_of_the_audio_file * total_number_of_channels
So if you send a 120-second file with 3 speaker separated channels, the total duration for billing is 360 seconds, or 6 minutes.
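The formula can be checked with a few lines of code. The function name `billable_seconds` is hypothetical and exists only to illustrate the arithmetic above.

```python
def billable_seconds(duration_seconds, channels):
    """Apply totalDuration = duration_of_the_audio_file * total_number_of_channels."""
    if duration_seconds < 0 or channels < 1:
        raise ValueError("duration must be non-negative and channels at least 1")
    return duration_seconds * channels


total = billable_seconds(120, 3)
print(total, "seconds =", total / 60, "minutes")  # 360 seconds = 6.0 minutes
```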
Request Params
Key | Datatype | Required | Description |
---|---|---|---|
name | string | | Your meeting name. Defaults to the conversationId. |
customVocabulary | string | | Contains a list of words and phrases that provide hints to the speech recognition task. |
confidenceThreshold | string | | Minimum required confidence for the insight to be recognized. The range is from 0.0 to 1.0. Default value is 0.5. |
detectEntities | boolean | | If not set to true, the Entities API will not return any entities from the conversation. |
detectPhrases | boolean | | Shows Actionable Phrases in each sentence of the conversation. These sentences can be found in the Conversation's Messages API. |
languageCode | string | | Several languages are supported. They can be found here: https://docs.symbl.ai/docs/async-api/overview/async-api-supported-languages |
mode | string | | 'phone' mode is best for audio generated from phone calls (typically recorded at an 8 kHz sampling rate). 'default' mode works best for audio generated from video or online meetings (typically recorded at a 16 kHz or higher sampling rate). When you don't pass this parameter, 'default' is selected automatically. |
trackers | string | | A list of keywords and/or phrases to be tracked using the Tracker API. |
HEADERS
Key | Datatype | Required | Description |
---|---|---|---|
x-api-key | string |