Welcome to part 3 of the tutorial series on AWS Audio Analysis. In the previous tutorial, I took you through resource creation as per the architecture diagram. In this tutorial, I’m going to take you through configuring the lambda function for transcription.

Go ahead and open the lambda function that we created in the previous tutorial (i.e. aws-audio-analysis). Copy and paste the code below into that lambda function and save it.

# -*- coding: utf-8 -*-
# AWS Lambda
# Contributor: Chirag Rathod (Srce Cde)

import json
from urllib.parse import unquote_plus

import boto3

transcribe = boto3.client('transcribe')


def lambda_handler(event, context):
    if event:
        # The S3 trigger delivers one or more records; use the first one.
        file_obj = event["Records"][0]
        bucket_name = str(file_obj['s3']['bucket']['name'])
        # Object keys arrive URL-encoded in S3 events, so decode them first.
        file_name = unquote_plus(str(file_obj['s3']['object']['key']))
        s3_uri = create_uri(bucket_name, file_name)
        # The request ID is unique per invocation, so it works as a job name.
        job_name = context.aws_request_id
        transcribe.start_transcription_job(
            TranscriptionJobName=job_name,
            Media={'MediaFileUri': s3_uri},
            MediaFormat='mp3',
            LanguageCode='en-US',
            OutputBucketName='bucket-name',  # replace with your output bucket
            Settings={
                # 'VocabularyName': 'string',
                'ShowSpeakerLabels': True,
                'MaxSpeakerLabels': 2,
                'ChannelIdentification': False
            }
        )

    return {
        'statusCode': 200,
        'body': json.dumps('Transcription job created!')
    }


def create_uri(bucket_name, file_name):
    return "s3://" + bucket_name + "/" + file_name
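
To see what the handler reads from the S3 trigger, here is a minimal sketch that runs locally without AWS. The `sample_event` dictionary is a trimmed-down, made-up example of an S3 notification, and `extract_s3_uri` is a hypothetical helper name used only for illustration.

```python
from urllib.parse import unquote_plus

# Trimmed-down, made-up example of an S3 notification; real events carry more fields.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "aws-audio-analysis"},
                "object": {"key": "recordings/team+call.mp3"},
            }
        }
    ]
}


def extract_s3_uri(event):
    """Return the s3:// URI of the first object referenced in the event."""
    record = event["Records"][0]
    bucket_name = record["s3"]["bucket"]["name"]
    # Keys arrive URL-encoded (a space becomes "+"), so decode before use.
    file_name = unquote_plus(record["s3"]["object"]["key"])
    return "s3://" + bucket_name + "/" + file_name


print(extract_s3_uri(sample_event))
# prints: s3://aws-audio-analysis/recordings/team call.mp3
```

The resulting URI is the value passed as `MediaFileUri` when the transcription job is started.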

Apart from that, increase the lambda timeout from the default 3 seconds to 5 minutes and save the lambda function. Note that even 5 minutes might not be enough for longer audio files.
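
If you prefer to change the timeout from code rather than the console, the following is a minimal sketch using the boto3 Lambda client's `update_function_configuration` call. The helper name `set_lambda_timeout` is hypothetical, and the function name `aws-audio-analysis` is assumed from the previous tutorial.

```python
def set_lambda_timeout(lambda_client, function_name, timeout_seconds=300):
    """Set a Lambda function's timeout and return the value AWS reports back."""
    # Lambda accepts timeouts from 1 to 900 seconds (15 minutes).
    if not 1 <= timeout_seconds <= 900:
        raise ValueError("timeout_seconds must be between 1 and 900")
    response = lambda_client.update_function_configuration(
        FunctionName=function_name,
        Timeout=timeout_seconds,
    )
    return response["Timeout"]


# Usage (needs AWS credentials with permission to update the function):
#   import boto3
#   set_lambda_timeout(boto3.client("lambda"), "aws-audio-analysis")
```

Passing the client in as an argument keeps the helper easy to try out with a stub before pointing it at a real account.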

Now, navigate to the S3 management console, open the aws-audio-analysis S3 bucket (in my case), and upload an audio file to test the lambda function. Once the file is uploaded, you can check the CloudWatch logs of the lambda function to track the progress of the transcription job.
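
The upload can also be done from a script instead of the console. This is a rough sketch built on the boto3 S3 client's `upload_file` method. The helper name `upload_test_audio` and the local path in the usage comment are hypothetical, and the bucket name is the one used in my case.

```python
import os


def upload_test_audio(s3_client, local_path, bucket_name="aws-audio-analysis"):
    """Upload a local audio file to the bucket that triggers the lambda function."""
    # Use the file name as the object key so the upload is easy to spot in the bucket.
    key = os.path.basename(local_path)
    s3_client.upload_file(local_path, bucket_name, key)
    return "s3://" + bucket_name + "/" + key


# Usage (needs AWS credentials with write access to the bucket):
#   import boto3
#   upload_test_audio(boto3.client("s3"), "samples/interview.mp3")
```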


Here, we have a limitation in terms of the lambda function. If we drop a larger audio file (in length/duration), a lambda function that waits for the result will time out before Amazon Transcribe is able to finish the job, and the invocation fails. I will discuss how we can overcome this limitation in the upcoming tutorial.
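
To make the limitation concrete, here is a minimal sketch, assuming the lambda function waits for the job by polling `get_transcription_job`. The names `wait_for_job`, `poll_seconds`, `safety_margin_ms` and the `TIMED_OUT_WAITING` return value are hypothetical and are not part of the code above. `get_remaining_time_in_millis` is the method the Lambda context object provides for checking how much time the invocation has left.

```python
import time


def wait_for_job(transcribe_client, job_name, context,
                 poll_seconds=5, safety_margin_ms=10_000):
    """Poll a transcription job until it finishes or the Lambda time budget runs low."""
    while context.get_remaining_time_in_millis() > safety_margin_ms:
        response = transcribe_client.get_transcription_job(
            TranscriptionJobName=job_name
        )
        status = response["TranscriptionJob"]["TranscriptionJobStatus"]
        if status in ("COMPLETED", "FAILED"):
            return status
        time.sleep(poll_seconds)
    # Out of time: the job keeps running in Amazon Transcribe,
    # but this invocation has to stop waiting for it.
    return "TIMED_OUT_WAITING"
```

For a long recording the loop runs out of time budget well before the job reaches `COMPLETED`, which is exactly the situation described above. Stopping early with a marker value at least avoids a hard timeout error.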

I have explained each line of the code at a high level within the video tutorial. Please refer to that video tutorial for an in-depth explanation.

In the next tutorial, I will take you through the problem with our current architecture. Until I post the next update on AWS Audio Analysis, refer to my YouTube channel for more tutorials. Keep sharing and stay tuned for more. Follow me on Twitter.