AWS Cloud Operations & Migrations Blog

Automating custom cost and usage tracking for member account owners in the AWS Migration Acceleration Program

This blog post was contributed by Kanishk Mahajan, AWS and Kalpana Roge, McAfee

The AWS Migration Acceleration Program (MAP) is a cloud migration program that helps enterprises achieve business benefits by migrating existing workloads to Amazon Web Services. MAP provides consulting support, training, and credits on AWS services to reduce risk, build a strong operational foundation and help offset the initial cost of migrations. MAP includes a migration methodology for running legacy migrations in a methodical way and a robust set of tools to automate and accelerate common migration scenarios.

AWS Cost and Usage Reports (AWS CUR) help customers optimize their AWS spending, allocate costs internally, and drive the right governance model for their organizations. AWS recently announced Cost and Usage Reports for member accounts, which let member accounts in an AWS Organization set up reports containing the cost and usage data for just their own account.

Our solution extends this functionality in two ways. First, it provides automation that delivers CUR reports directly to member accounts and lets the payer account owner customize the data that is shared with and delivered to those accounts. Second, the member account reports contain data specific to MAP tagged resources. Previously, AWS member account owners who were enrolled in the AWS Migration Acceleration Program but did not have access to the payer account had no direct visibility into their AWS spending on MAP tagged resources. Our solution allows member account owners to continuously track and optimize their MAP spending directly from their member account. It also gives the payer account owner control over the MAP related data shared with specific member account owners.

Solution overview

To query CUR reports using Amazon Athena, follow the steps to set up Amazon Athena integration for the payer account. This integration enables the payer account owner to use Amazon Athena, a serverless query service, to analyze the data for MAP tagged resources using standard SQL. This data is delivered in the Amazon Simple Storage Service (Amazon S3) bucket associated with the CUR report.

The Amazon Athena integration also provides the payer account with a setup that uses an AWS Glue crawler to automatically update the Amazon Athena database and tables each time the CUR report is delivered in the payer account.

Our solution works by provisioning an Amazon EventBridge rule in the payer account. The rule is triggered each time the Glue crawler associated with the CUR report successfully completes execution in the payer account. The target of this EventBridge rule is an AWS Lambda function that runs an extraction query in Amazon Athena.
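The trigger described above can be sketched in Python as an EventBridge event pattern for a successful Glue crawler run. This is a minimal illustration, not the contents of the solution's template: the function names, the rule name, and the target Id are hypothetical, and the small `matches` helper only mimics exact-value matching so that the pattern can be checked locally.

```python
import json


def crawler_success_pattern(crawler_name):
    """Event pattern that matches a successful run of one Glue crawler."""
    return {
        "source": ["aws.glue"],
        "detail-type": ["Glue Crawler State Change"],
        "detail": {
            "crawlerName": [crawler_name],
            "state": ["Succeeded"],
        },
    }


def matches(pattern, event):
    """Local check for exact-value patterns (not the full EventBridge syntax)."""
    for key, expected in pattern.items():
        actual = event.get(key)
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


def create_rule(rule_name, crawler_name, lambda_arn):
    """Create the rule and attach the Lambda target (needs boto3 and credentials)."""
    import boto3  # imported lazily so the helpers above run without AWS access

    events = boto3.client("events")
    events.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(crawler_success_pattern(crawler_name)),
        State="ENABLED",
    )
    events.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "extraction-lambda", "Arn": lambda_arn}],  # hypothetical Id
    )
```

The Lambda function also needs a resource-based permission that allows EventBridge to invoke it; that step is left out of the sketch.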

This query extracts data from the CUR report in the payer account. This extracted data is customized to the requirements of what can be shared with the member account. The data is in the format of Apache Parquet files, which are stored in an Amazon S3 output bucket. This S3 bucket has been provisioned with an attached event notification that assigns view permissions to the member account owner for objects in this S3 bucket. These permissions facilitate querying by the member account owner.
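One way the S3 event notification could assign view permissions is by setting an object ACL for each new Parquet file. The sketch below assumes that approach (suggested by the canonical user IDs the template asks for), but the solution's actual mechanism is not shown in this post. The function name is hypothetical, and `s3_client` is expected to be a boto3 S3 client or any object with a compatible `put_object_acl` method.

```python
from urllib.parse import unquote_plus


def grant_member_read(s3_client, event, payer_canonical_id, member_canonical_id):
    """Grant the member account read access to every object named in an S3 event."""
    updated = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        s3_client.put_object_acl(
            Bucket=bucket,
            Key=key,
            GrantFullControl=f"id={payer_canonical_id}",  # payer keeps full control
            GrantRead=f"id={member_canonical_id}",        # member can read the object
        )
        updated.append((bucket, key))
    return updated
```

Object ACLs only take effect if the bucket's Object Ownership setting permits ACLs, so this sketch assumes a bucket configured that way.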

Our solution provisions an AWS Glue crawler in the member account that has cross-account access to the output results in the S3 bucket in the payer account. The crawler populates and updates the Amazon Athena database and table in the member account based on the Apache Parquet data extracted in the output bucket. Member account owners can now view and query data for MAP tagged resources from their own accounts.

Figure 1 shows the request flow for our solution.

Figure 1: Interaction between solution components, including AWS Glue crawler, Amazon EventBridge rule, Lambda function, S3 bucket, Amazon Athena, and AWS CloudFormation.

Solution components

The full solution is available to download and install from our GitHub repo and consists of the following components:

AWS CloudFormation templates:

  • aws-map-payeraccountsetup.yaml
    • Provisions an EventBridge rule in the payer account that is triggered by the successful completion of the AWS Glue crawler. The crawler runs in the payer account each time a MAP CUR report is delivered there and creates/updates the Athena database and table for the MAP CUR report.
    • Provisions an AWS Lambda function as a target of the EventBridge rule. The Lambda function runs a customized Athena query for MAP tagged resources in the member account.
    • Provisions an Amazon S3 output bucket in the payer account with an attached event notification to assign view permissions to objects in the S3 bucket to the member account.
  • aws-map-linkedaccountcrawler.yml
    • Provisions an AWS Glue crawler in the member account that has cross-account access to the Amazon S3 output bucket in the payer account. The crawler creates an Athena database and table in the member account. This enables the member account owner to view and query data for MAP tagged resources specific to their account.

AWS Lambda functions:

  • MAP_athenaextractionquerylambda.py
    • Runs a customized Athena query for MAP tagged resources in the member account. The results of this query are Apache Parquet files that are delivered to the Amazon S3 output bucket.
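To illustrate what such a function might look like, here is a minimal sketch of a handler that looks up a saved Athena query by name and starts it. It is not the code in MAP_athenaextractionquerylambda.py. The helper names and the environment variable names (EXTRACTION_QUERY_NAME, ATHENA_OUTPUT_LOCATION) are assumptions made for the example; the Athena calls themselves (`list_named_queries`, `batch_get_named_query`, `start_query_execution`) are standard boto3 operations.

```python
import os


def find_named_query(athena, query_name):
    """Return the saved Athena query with this name, or raise LookupError."""
    token = None
    while True:
        kwargs = {"NextToken": token} if token else {}
        page = athena.list_named_queries(**kwargs)
        ids = page.get("NamedQueryIds", [])
        if ids:
            batch = athena.batch_get_named_query(NamedQueryIds=ids)
            for query in batch.get("NamedQueries", []):
                if query["Name"] == query_name:
                    return query
        token = page.get("NextToken")
        if not token:
            raise LookupError(f"no saved Athena query named {query_name!r}")


def run_extraction(athena, query_name, output_location):
    """Start the saved extraction query and return its execution ID."""
    query = find_named_query(athena, query_name)
    response = athena.start_query_execution(
        QueryString=query["QueryString"],
        QueryExecutionContext={"Database": query["Database"]},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


def lambda_handler(event, context):
    import boto3  # available in the Lambda Python runtime

    athena = boto3.client("athena")
    execution_id = run_extraction(
        athena,
        os.environ["EXTRACTION_QUERY_NAME"],    # hypothetical variable name
        os.environ["ATHENA_OUTPUT_LOCATION"],   # hypothetical variable name
    )
    return {"QueryExecutionId": execution_id}
```

A CREATE TABLE AS statement fails if the target table already exists or if the external location already holds data, so a function that overwrites earlier output presumably cleans up first. That cleanup is omitted here.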

Walkthrough

This section describes the prerequisites and steps required for you to set up and deploy the solution. It also provides you with a walkthrough of a scenario for our MAP use case.

Prerequisites

  1. To track credits and spending in the MAP program, first set up a CUR report in the payer account that tracks MAP tagged resources.
  2. Set up Amazon Athena integration with Cost and Usage Reports to enable querying of CUR reports with Athena. To streamline and automate this integration, AWS provides an AWS CloudFormation template that creates several key resources along with the reports you set up for Athena integration. We recommend deploying that template instead of performing a manual setup.
  3. Create an S3 bucket with the following name: s3-maplinkedaccountlambdas-<AccountId>-<Region> where the <AccountId> and <Region> are the AWS account ID and the AWS Region of the payer account where the MAP CUR reports are delivered. In this bucket, create a folder named MAP_athenaextractionquerylambda. Place the MAP_athenaextractionquerylambda.zip file in the folder.
  4. Navigate to the Amazon Athena console. In the query editor, enter the following custom MAP extraction query:

CREATE TABLE <Athenadatabasename>.temp_table
WITH (format = 'Parquet', parquet_compression = 'GZIP',
      external_location = 's3://<S3Outputbucketname>/<S3Outputbucketfolder>/',
      partitioned_by = ARRAY['year_1','month_1'])
AS SELECT *, year AS year_1, month AS month_1
FROM "<Athenadatabasename>"."<Athenatablename>"
WHERE line_item_usage_account_id LIKE '<Memberaccountid>'

Substitute the following information in the query:

    • S3Outputbucketname – Enter a name for the S3 bucket that will be created by aws-map-payeraccountsetup.yaml in the payer account.
    • S3Outputbucketfolder – Enter a name for a folder that will be created in the S3 bucket in the payer account by aws-map-payeraccountsetup.yaml.
    • Athenadatabasename – This is the name of the Athena database that was created in the payer account.
    • Athenatablename – This is the name of the Athena table that contains the CUR report data in the Athena database in the payer account.
    • Memberaccountid – This is the 12-digit AWS account ID of the member account.

Choose Save As, and then enter a name and description for the query. Make a note of the query name. You will need this in later steps.

You can add any custom criteria to this extraction query and customize this query according to your business requirements. As an example, we provide another extraction query for a sample scenario later in this post.

You can change this extraction query after the setup. The next time the CUR report is delivered in the payer account, the Lambda function in our solution runs the new extraction query and overwrites the contents of the data to be shared with the member account.
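As an alternative to editing the placeholders by hand, the substitution in the extraction query above could be scripted. The sketch below fills in the five values and optionally saves the result as a named query. The function names and validation rules are assumptions made for illustration; `create_named_query` is the standard boto3 Athena call for saving a query.

```python
import re

CTAS_TEMPLATE = """CREATE TABLE {database}.temp_table
WITH (format = 'Parquet', parquet_compression = 'GZIP',
      external_location = 's3://{bucket}/{folder}/',
      partitioned_by = ARRAY['year_1','month_1'])
AS SELECT *, year AS year_1, month AS month_1
FROM "{database}"."{table}"
WHERE line_item_usage_account_id LIKE '{account_id}'"""


def render_extraction_query(database, table, bucket, folder, account_id):
    """Fill in the extraction query, rejecting values that look malformed."""
    if not re.fullmatch(r"\d{12}", account_id):
        raise ValueError("member account ID must be exactly 12 digits")
    for identifier in (database, table):
        if not re.fullmatch(r"[A-Za-z0-9_]+", identifier):
            raise ValueError(f"unexpected characters in {identifier!r}")
    return CTAS_TEMPLATE.format(
        database=database,
        table=table,
        bucket=bucket,
        folder=folder.strip("/"),  # avoid doubled slashes in the S3 path
        account_id=account_id,
    )


def save_query(athena, name, database, query_string):
    """Save the query in Athena so it can be referenced by name later."""
    response = athena.create_named_query(
        Name=name,
        Description="MAP extraction query",
        Database=database,
        QueryString=query_string,
    )
    return response["NamedQueryId"]
```

For example, `render_extraction_query("curdb", "map_cur", "out-bucket", "map", "111122223333")` returns the statement with every placeholder replaced.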

Solution setup

The solution automates the entire setup and deployment in two steps:

Step 1: Launch the aws-map-payeraccountsetup.yaml template in the payer account. In the AWS CloudFormation console, create a stack to launch this template. In Parameters, enter the values for the parameters based on their descriptions in the template. The template takes the following parameters:

    • sourcebucket – Name of the S3 bucket that contains AWS Lambda source. Replace <AccountID> and <Region> with the AWS Account ID and Region where you are deploying this template.
    • s3outputbucketname – Name of the S3 bucket that contains the extracted Parquet files for the member account. This is the same as the ‘S3Outputbucketname’ that was used in the MAP extraction query in the prerequisites step.
    • outputfolder – Subfolder in the S3 output bucket. This is the same as the ‘S3Outputbucketfolder’ that was used in the MAP extraction query in the prerequisites step.
    • mapmigratedtable – Athena table name for the MAP Reports delivered in the payer account. This table is created by the Athena CUR integration in the prerequisites step.
    • extractionqueryname – Name of the saved Athena extraction query. This is the MAP extraction query that was created earlier in the prerequisites step.
    • athenaoutputlocation – Athena results location from Athena settings. When you run the query using the Athena console, the Query result location entered under Settings in the navigation bar determines the client-side setting.
    • payergluecrawlername – Name of the Glue crawler that creates MAP reports in payer account. This crawler is created by the Athena CUR integration in the prerequisites step.
    • linkedaccountid – AWS Account ID of the member account.
    • canonicalidpayer – Canonical user ID of the payer account. This is an alphanumeric identifier and it is an obfuscated form of the AWS account ID. You use this ID to identify an AWS account when granting cross-account access to buckets and objects using Amazon S3. You can retrieve the canonical user ID for your AWS account as either the root user or an IAM user.
    • canonicalidlinked – Canonical user ID of the member account.

The aws-map-payeraccountsetup.yaml template creates an Amazon S3 bucket named s3-<s3outputbucketname>-<AccountId>-<Region>, where <s3outputbucketname> is the value of the s3outputbucketname parameter that you supplied when launching the template, <AccountId> is your account ID, and <Region> is the AWS Region where you deployed the template. In this bucket, create a folder with the same name as the value of the outputfolder parameter that you entered above.
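If you prefer to script Step 1 rather than use the console, the parameters and bucket naming conventions above could be assembled as in the following sketch. The parameter keys come from the list above; the helper names are hypothetical, and the `CAPABILITY_NAMED_IAM` capability is an assumption about what the template requires, so adjust it to match the template you deploy.

```python
REQUIRED_KEYS = (
    "sourcebucket", "s3outputbucketname", "outputfolder", "mapmigratedtable",
    "extractionqueryname", "athenaoutputlocation", "payergluecrawlername",
    "linkedaccountid", "canonicalidpayer", "canonicalidlinked",
)


def lambda_source_bucket(account_id, region):
    """Name of the bucket that holds the Lambda zip (prerequisite 3)."""
    return f"s3-maplinkedaccountlambdas-{account_id}-{region}"


def output_bucket(name, account_id, region):
    """Name of the output bucket that the payer template creates."""
    return f"s3-{name}-{account_id}-{region}"


def to_cfn_parameters(values):
    """Convert a plain dict into the list format CloudFormation expects."""
    missing = [key for key in REQUIRED_KEYS if key not in values]
    if missing:
        raise ValueError(f"missing parameters: {', '.join(missing)}")
    return [
        {"ParameterKey": key, "ParameterValue": str(values[key])}
        for key in REQUIRED_KEYS
    ]


def create_payer_stack(stack_name, template_body, values):
    """Launch the payer template (needs boto3 and credentials)."""
    import boto3  # imported lazily so the helpers above run without AWS access

    cfn = boto3.client("cloudformation")
    response = cfn.create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        Parameters=to_cfn_parameters(values),
        Capabilities=["CAPABILITY_NAMED_IAM"],  # assumption, see note above
    )
    return response["StackId"]
```

Checking for missing keys before calling CloudFormation surfaces a forgotten parameter immediately instead of as a failed stack creation.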

Step 2: Launch the aws-map-linkedaccountcrawler.yml template in the member account. In the AWS CloudFormation console, create a stack to launch this template. In Parameters, enter the values for the parameters based on their descriptions in the template.

Query your MAP data in the member account

As a member account owner, you can now run queries directly from your member account on MAP tagged resources without access to the payer account. In our solution, the data in the member account is updated daily. You can configure the frequency in the parameters in the aws-map-linkedaccountcrawler.yml template. The payer account has control of the data elements that are shared from the MAP CUR report with the member accounts.
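AWS Glue crawler schedules are written as six-field cron expressions. The sketch below builds a daily schedule and applies it to an existing crawler with the boto3 `update_crawler_schedule` call. It only illustrates the schedule format: the template's own parameter name is not shown in this post, and the helper names and crawler name used here are hypothetical.

```python
def daily_cron(hour_utc, minute=0):
    """Build a Glue cron expression that fires once a day at the given UTC time."""
    if not (0 <= hour_utc <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute 0-59")
    # Fields: minutes, hours, day-of-month, month, day-of-week, year
    return f"cron({minute} {hour_utc} * * ? *)"


def set_crawler_schedule(glue, crawler_name, schedule):
    """Apply a schedule to an existing crawler in the member account."""
    glue.update_crawler_schedule(CrawlerName=crawler_name, Schedule=schedule)
```

Scheduling the member account crawler some time after the payer account's CUR delivery usually completes gives the extraction query time to finish before the crawl starts.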

Scenario

Let’s walk through a business scenario for our MAP use case. Here are our requirements:

  1. The payer account is only required to share product and pricing data for MAP tagged resources and for a configurable set of member accounts in a business unit.
  2. The business unit has delegated a member account where CUR MAP data can be queried and analyzed. The owners of this delegated member account query and analyze data for all other member accounts in the business unit.
  3. The delegated member account owners in the business unit do not have access to the payer account.

This is our extraction query:

CREATE TABLE <Athenadatabasename>.temp_table
WITH (
      format = 'Parquet',
      parquet_compression = 'GZIP',
      external_location = 's3://<S3Outputbucketname>/<S3Outputbucketfolder>/',
      partitioned_by = ARRAY['year_1','month_1'])
AS SELECT product_product_name, product_instance_type, product_region, product_product_family, product_servicecode,
      line_item_blended_cost, discount_bundled_discount, pricing_public_on_demand_cost, pricing_term,
      resource_tags_user_map_migrated AS MAPTAG, year AS year_1, month AS month_1
FROM "<Athenadatabasename>"."<Athenatablename>"
WHERE line_item_usage_account_id IN ('<memberaccountid1>', '<memberaccountid2>')
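Because the set of member accounts in the business unit is meant to be configurable, the account filter at the end of the query could be generated from a list. The following sketch shows one way to do that. The function name is hypothetical, and the 12-digit check is an added safeguard rather than part of the solution.

```python
import re


def account_filter(account_ids):
    """Build the WHERE-clause condition for a configurable set of member accounts."""
    ids = list(dict.fromkeys(account_ids))  # drop duplicates, keep order
    if not ids:
        raise ValueError("at least one member account ID is required")
    for account_id in ids:
        if not re.fullmatch(r"\d{12}", account_id):
            raise ValueError(f"not a 12-digit account ID: {account_id!r}")
    quoted = ", ".join(f"'{account_id}'" for account_id in ids)
    return f"line_item_usage_account_id IN ({quoted})"


print(account_filter(["111122223333", "444455556666"]))
```

Validating each ID before it is placed in the SQL text keeps malformed or unexpected input out of the saved query.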

To recap, our solution works by provisioning an Amazon EventBridge rule in the payer account. The rule is triggered each time the Glue crawler associated with the CUR report successfully completes execution and updates the Athena database and tables in the payer account. The target of this EventBridge rule is an AWS Lambda function that runs an extraction query in Amazon Athena.

So, in this case the Lambda function simply runs this extraction query the next time that the CUR report gets delivered in the payer account and overwrites the contents of the data to be shared with the member account.

Next, we sign in to our delegated member account in Amazon Athena and use the query editor to query the data. We query for MAP-related spending across all products in the member accounts from our delegated account:

SELECT resource_tags_user_map_migrated AS MAPTAGS,
     SUM(line_item_blended_cost + discount_bundled_discount) AS MAPSPEND
FROM "<Athenadatabasename>"."<Athenatablename>"
WHERE
    MONTH = CAST(MONTH(CURRENT_DATE) AS varchar(4))
    AND YEAR = CAST(YEAR(CURRENT_DATE) AS varchar(4))
    AND line_item_usage_account_id IN ('<memberaccountid1>', '<memberaccountid2>')
GROUP BY resource_tags_user_map_migrated, pricing_public_on_demand_cost
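The same query can also be run programmatically from the delegated member account instead of from the console. The sketch below polls Athena until a query finishes and converts the first page of results into dictionaries. The helper names are hypothetical; `get_query_execution` and the result layout from `get_query_results` (a header row followed by data rows for SELECT statements) are standard Athena API behavior.

```python
import time


def wait_for_query(athena, execution_id, poll_seconds=2.0, timeout_seconds=300.0):
    """Poll Athena until the query reaches a terminal state and return that state."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = athena.get_query_execution(QueryExecutionId=execution_id)
        state = response["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"query {execution_id} still {state} after timeout")
        time.sleep(poll_seconds)


def rows_as_dicts(result):
    """Turn the first page of a get_query_results response into a list of dicts."""
    rows = result["ResultSet"]["Rows"]
    if not rows:
        return []
    header = [cell.get("VarCharValue") for cell in rows[0]["Data"]]
    return [
        dict(zip(header, [cell.get("VarCharValue") for cell in row["Data"]]))
        for row in rows[1:]
    ]
```

A caller would start the query with `start_query_execution`, pass the returned ID to `wait_for_query`, and then hand the `get_query_results` response to `rows_as_dicts`. Values come back as strings, so numeric columns such as MAPSPEND need to be converted before further arithmetic.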

Conclusion

In this blog post, we described a solution that automates delivery of Cost and Usage Reports for MAP tagged resources directly to member accounts. It also allows the payer account owner to customize and share specific data in the Cost and Usage Reports with each member account. The solution thus gives AWS member account owners direct visibility into their AWS spending on MAP tagged resources without requiring them to have access to the payer account. We hope that you find this solution helpful, and we welcome your comments and feedback.

About the Authors

Kanishk is an ISV Solutions Architecture Lead at AWS. In this role, he leads cloud transformation and solution architecture for our Independent Software Vendor partners and mutual customers in all areas that relate to management and governance, security and compliance, and migrations and modernizations in AWS.

Kalpana Roge is a Sr Cloud Engineer at McAfee. Kalpana specializes in automating the foundation layer for AWS as well as automations for cost and financial management, chargebacks, and cost optimizations across public clouds.