Building an on-demand phone call recording solution with Amazon Chime SDK

The Amazon Chime SDK Public Switched Telephone Network (PSTN) audio service makes it easier for developers to build customized telephony applications using the agility and operational simplicity of serverless AWS Lambda functions. Starting today, developers can record voice calls and store the recordings in the Amazon Simple Storage Service (Amazon S3) bucket of their choice using the call recording feature. That means you can now implement call recording in the applications that you build, including voice menu trees, remote call forwarding, and call routing.

In this blog post, we will teach you how to implement call recording for two-party phone calls and transcribe the recordings after the call for later analysis. We will build a voice application that helps recruiters be more efficient during phone interviews while keeping their personal phone numbers private. When a recruiter needs to initiate a phone interview, they first call a special service number and then enter the interviewee’s phone number when prompted. The application then dials the interviewee’s phone number, informs them that the call is being recorded, and connects the two parties. When the call completes, the conversation is transcribed and the results are emailed to the interviewer. Additionally, interviewers can share the service number with interviewees so they can call the interviewer back. In both cases, the interviewer’s personal phone number is hidden from the interviewee.

Overview

To build this phone interview service demo, you will implement an Amazon Chime SDK SIP media application using Lambda functions, Amazon Transcribe Call Analytics, and other AWS services. This hands-on experience will familiarize you with the newly launched Amazon Chime SDK call recording feature and the powerful AWS speech analytics capabilities it unlocks.

Diagram of Amazon Chime SDK On Demand Recording

High-level design of the Amazon Chime SDK on-demand call recording and transcription solution.

The AWS elements used in this demo are: Amazon Chime SDK SIP media applications (SMA), Amazon Transcribe Call Analytics, Amazon Simple Storage Service (Amazon S3), Amazon DynamoDB, AWS Lambda, and Amazon Simple Notification Service (SNS).

Prerequisites

  • Node.js v12+ installed
  • npm installed
  • yarn installed
  • poetry installed
  • jq installed
  • AWS CLI installed
  • AWS CDK installed
    • npm install -g aws-cdk
    • Be sure to have the latest version installed. If you need to upgrade, uninstall with `npm uninstall -g aws-cdk` and then reinstall.
  • Docker installed
  • Deployment must be done in us-east-1
  • SourcePhone – an E.164-formatted number that will be used as the primary phone number
  • EmailSubscription – an email address that will receive notifications from SNS
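
Because SourcePhone must be in E.164 format, it can help to check the number before starting the deployment. The following is a minimal sketch, not part of the demo repository; the function name `is_e164` is hypothetical. It checks only the general E.164 shape (a leading "+", a non-zero first digit, and no more than 15 digits in total), not whether the number is actually assigned.

```python
import re

# E.164 shape: "+", a non-zero leading digit, then up to 14 more digits.
E164_PATTERN = re.compile(r"\+[1-9]\d{1,14}")


def is_e164(number: str) -> bool:
    """Return True if `number` has the general shape of an E.164 number."""
    return E164_PATTERN.fullmatch(number) is not None


if __name__ == "__main__":
    for candidate in ("+15555550100", "555-555-0100"):
        print(candidate, is_e164(candidate))
```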

Resources Created

  • outboundWav S3 Bucket – used to store Amazon Chime SIP media application’s wav audio files
  • recording S3 Bucket – used for storage of raw recordings, transcriptions, and processed output
  • triggerStepFunction Lambda – Lambda used to start the Step Function that will transcribe and process the audio file
  • smaHandler Lambda – Lambda used by SIP media application to process calls
  • transcribe Lambda – triggered by S3 PutObject in the recordings folder of the recording bucket. Starts the CallAnalytics Transcribe Job
  • createOutput Lambda – Triggered by EventBridge on completion of CallAnalytics Transcribe Job. Creates output file and sends link to SNS
  • SNS Topic – Used to send notification by email of the output file
  • SIP media application resources
    • Phone Number – a number that can be called by SourcePhone number to dial out to PSTN, or by PSTN to dial to SourcePhone
    • SIP rule – a SIP media application rule that will trigger on the dialed number

Demo Code Deployment Instructions

Cloud9 deployment is recommended to ensure the correct build of the Docker containers.

git clone https://github.com/aws-samples/amazon-chime-sma-on-demand-recording
cd amazon-chime-sma-on-demand-recording
./deploy.sh

Accept prompts for CDK deployment
When prompted, enter your primary phone number (SourcePhone) and email address.

Cloud9 Deployment Instructions

Expand the size of the EBS volume on your Cloud9 instance with the script here

nvm install 12
npm install -g aws-cdk
npm install -g yarn
sudo yum install jq
curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/install-poetry.py | python -
./deploy.sh

Demo Walk-through

Upon successfully deploying the CDK components, take note of the phone number in the output of the deployment script. This is the SMA phone number provisioned as described in the Resources Created section; it will be used by the application owner, who has the Interviewer role, as well as by another demo user who will play the Interviewee role. When the Interviewer calls the SMA phone number from the primary phone number (SourcePhone) provided during deployment, they will hear a prompt to enter the phone number to be called (the Interviewee’s number). Once the target number is entered, the SIP media application will dial that phone number. When the Interviewee answers this call, they will hear a prompt that the call will be recorded, and then the two demo users will be able to talk. During the conversation, the Interviewer can pause, resume, and stop the call recording using DTMF commands (5 to pause, 6 to resume, 7 to stop). When the call ends and both parties have disconnected, a wav audio file with the call recording is written to the configured Amazon S3 bucket. This PutObject request starts the process of transcribing the audio, followed by the steps that create the final output document. The download links to the audio file and the transcription file are then emailed to the primary user (the Interviewer).
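
The DTMF recording controls described above can be sketched as a small lookup from the pressed digit to a PSTN audio action. This is an illustrative sketch rather than the code in the demo repository: the helper name `recording_action_for_digit` is hypothetical, and how the digit reaches the handler is not shown. The action types `PauseCallRecording`, `ResumeCallRecording`, and `StopCallRecording` are the recording-control actions of the PSTN audio service.

```python
# DTMF digit -> PSTN audio action type, using the demo's key assignments.
RECORDING_COMMANDS = {
    "5": "PauseCallRecording",
    "6": "ResumeCallRecording",
    "7": "StopCallRecording",
}


def recording_action_for_digit(digit, call_id):
    """Return the recording-control action for a DTMF digit, or None.

    Hypothetical helper: digits without a recording command are ignored.
    """
    action_type = RECORDING_COMMANDS.get(digit)
    if action_type is None:
        return None
    return {"Type": action_type, "Parameters": {"CallId": call_id}}
```

Keeping the mapping in one dictionary makes it straightforward to change the key assignments without touching the handler logic.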

Conversely, if anyone other than the primary user dials the SMA phone number, a prompt is played informing the caller that the call is being recorded, and then the primary user is called automatically. Once the call is completed, the same workflow as above will execute, producing a formatted transcript and emailing its download link to the primary user.
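
The two call directions described in this walk-through come down to one decision: does the caller's number match SourcePhone? A minimal sketch of that decision follows. The function name `plan_call` and the returned dictionary keys are hypothetical and are not taken from the demo repository; the actual handler may structure this differently.

```python
def plan_call(caller_number, source_phone):
    """Decide which call flow applies, based on who is calling.

    Hypothetical helper: the keys of the returned dict are illustrative only.
    """
    if caller_number == source_phone:
        # The Interviewer is calling: prompt for the Interviewee's number first.
        return {"flow": "outbound", "next_step": "collect_digits", "bridge_to": None}
    # Anyone else is calling: announce the recording, then call the primary user.
    return {
        "flow": "inbound",
        "next_step": "play_recording_notice",
        "bridge_to": source_phone,
    }
```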

How It Works

This application is composed of several components that work together without being tightly coupled to each other. The Amazon Chime SIP media application is the main entry point in either direction and will be used for the duration of the call. This SIP media application is controlled by the smaHandler Lambda function using an Invocation and Action response process described here. All of the call routing logic is contained within this Lambda, which makes use of the CallAndBridge action to route calls from one user to another. This Lambda also starts the recording process by directing the output to an S3 bucket with the following action:

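def start_call_recording(call_id):
    # recording_bucket is the name of the recording S3 bucket, defined elsewhere in the handler.
    # "BOTH" records the incoming and the outgoing audio tracks.
    return {
        "Type": "StartCallRecording",
        "Parameters": {
            "CallId": call_id,
            "Track": "BOTH",
            "Destination": {"Type": "S3", "Location": recording_bucket + "/recordings"},
        },
    }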
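
For comparison, the CallAndBridge routing mentioned above can be sketched in the same style, together with the response envelope that a SIP media application Lambda returns. This is a sketch under assumptions, not the demo's code: the helper names `call_and_bridge` and `build_response` are hypothetical, and the 30-second timeout is an arbitrary illustrative value. Passing the SMA phone number as the caller ID is one way to keep the Interviewer's personal number hidden.

```python
def call_and_bridge(caller_id_number, target_number, timeout_seconds=30):
    """Build a CallAndBridge action that dials a PSTN number and bridges the call.

    Hypothetical helper; the default timeout is an illustrative assumption.
    """
    return {
        "Type": "CallAndBridge",
        "Parameters": {
            "CallTimeoutSeconds": timeout_seconds,
            "CallerIdNumber": caller_id_number,
            "Endpoints": [{"Uri": target_number, "BridgeEndpointType": "PSTN"}],
        },
    }


def build_response(*actions):
    """Wrap one or more actions in the response returned to the SIP media application."""
    return {"SchemaVersion": "1.0", "Actions": list(actions)}
```

A handler could then return, for example, `build_response(call_and_bridge(sma_number, interviewee_number))`, where both variable names are placeholders.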

Once the recording has been delivered to the S3 bucket, a Lambda is triggered by the PutObject event. This Lambda starts a Step Function that transcribes the recording and processes the transcription.
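
The hand-off from the S3 event to the Step Function can be sketched as follows. This is a minimal illustration, not the demo's triggerStepFunction code: the function names and the shape of the execution input (`bucket` and `key`) are assumptions. The client is passed in as a parameter so the logic can be exercised without AWS access; in a Lambda it would be `boto3.client("stepfunctions")`.

```python
import json
from urllib.parse import unquote_plus


def recordings_from_s3_event(event):
    """Yield (bucket, key) for each object in an S3 notification event."""
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys in S3 notifications arrive URL-encoded.
        key = unquote_plus(record["s3"]["object"]["key"])
        yield bucket, key


def start_processing(sfn_client, state_machine_arn, event):
    """Start one Step Functions execution per recording in the event.

    The input payload shape is an assumption made for this sketch.
    """
    executions = []
    for bucket, key in recordings_from_s3_event(event):
        response = sfn_client.start_execution(
            stateMachineArn=state_machine_arn,
            input=json.dumps({"bucket": bucket, "key": key}),
        )
        executions.append(response["executionArn"])
    return executions
```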

The transcription uses the Amazon Transcribe Call Analytics feature, which is invoked as follows:

# role_0 and role_1 must each be "AGENT" or "CUSTOMER", one per audio channel.
response = transcribe.start_call_analytics_job(
  CallAnalyticsJobName=call_analytics_job_name,
  Media={"MediaFileUri": "s3://" + bucket + "/" + key},
  DataAccessRoleArn=DATA_ACCESS_ROLE,
  OutputLocation="s3://" + bucket + "/transcriptions/" + recording_date + "/" + call_id + ".json",
  Settings={"LanguageOptions": ["en-US"]},
  ChannelDefinitions=[
    {"ChannelId": 0, "ParticipantRole": role_0},
    {"ChannelId": 1, "ParticipantRole": role_1},
  ],
)
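
The snippet above does not show how the job name is derived. Amazon Transcribe job names may contain only letters, digits, periods, underscores, and hyphens, are limited to 200 characters, and must be unique within the account, so a name built from a call ID may need cleaning up. A minimal sketch follows; the helper name `make_job_name` and the `recording` prefix are hypothetical, and the sketch does not check uniqueness.

```python
import re

MAX_JOB_NAME_LENGTH = 200


def make_job_name(call_id, prefix="recording"):
    """Build a job name that fits Amazon Transcribe's naming rules.

    Hypothetical helper: replaces disallowed characters with "-" and truncates.
    """
    raw = f"{prefix}-{call_id}"
    cleaned = re.sub(r"[^0-9a-zA-Z._-]", "-", raw)
    return cleaned[:MAX_JOB_NAME_LENGTH]
```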

Once the transcription is complete, the createOutput Lambda will be invoked by the Step Function to produce a document with the full turn-by-turn transcription along with other analytics. This Lambda makes use of the application described here and uses an SNS topic to deliver a link to the created output file.
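
To give a sense of what a turn-by-turn transcript looks like, here is a minimal sketch that formats a Call Analytics result as plain text. It is not the createOutput Lambda from the demo. It assumes the output JSON contains a `Transcript` list whose entries carry `ParticipantRole`, `Content`, and `BeginOffsetMillis` fields; verify these names against an actual job output before relying on them. The function name `format_transcript` is hypothetical.

```python
def format_transcript(call_analytics_output):
    """Render a Call Analytics result as one line of text per speaker turn.

    Assumes a "Transcript" list with "ParticipantRole", "Content", and
    "BeginOffsetMillis" fields; missing fields fall back to defaults.
    """
    lines = []
    for turn in call_analytics_output.get("Transcript", []):
        seconds = turn.get("BeginOffsetMillis", 0) // 1000
        stamp = f"{seconds // 60:02d}:{seconds % 60:02d}"
        role = turn.get("ParticipantRole", "UNKNOWN")
        content = turn.get("Content", "")
        lines.append(f"[{stamp}] {role}: {content}")
    return "\n".join(lines)
```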

Cleanup

To clean up this demo, execute: cdk destroy. The S3 buckets that were created will be emptied and deleted as part of this operation, so if you wish to keep the files, move them before destroying the demo solution.

Conclusion

The sample telephone interview solution presented in this blog post demonstrates the basic techniques for implementing on-demand call recording in your Amazon Chime SDK SIP media applications. It shows how to start, stop, pause, and resume phone call recording programmatically and with DTMF commands during the call, how to store the recorded audio files, and how to control precisely which parts of the call are recorded. You have also experienced the powerful speech analytics capabilities available through Amazon Transcribe Call Analytics. You can use the technology and the development patterns showcased in this demo to implement a broad range of solutions for specialized industry applications, such as public media, insurance, remote healthcare and education, and enterprise telephony and collaboration. You can also augment cloud-based or on-premises call centers with interaction recording, quality management, and machine learning-enabled speech analytics.

Please see our GitHub repo for the example code: https://github.com/aws-samples/amazon-chime-sma-on-demand-recording