Using CloudFront origin groups to increase availability on SPA deployments

Dec 13, 2021 aws cloudfront lambda lambdaedge spa single page application s3

Overview
Adding high-availability
Honorable mentions
Conclusion

Overview

Adding automated failover to a second AWS region for your SPA deployment is a simple, cost-effective way to increase site availability! In this post, we’ll cover the sometimes forgotten parts of Amazon CloudFront and Lambda@Edge (and purpose-driven functionality), along with Amazon S3 buckets as native origins within an origin group.

Services utilized

In this post, we’ll cover a few different AWS services. This includes:

Amazon S3
Amazon CloudFront
Lambda@Edge functions
AWS CodePipeline (or preferred deployment to output artifacts into Amazon S3)

Existing deployment

This post assumes that a SPA (single page application) deployment is already in place. An example is listed here, and we can build on that:

spa app

Existing deployment availability

As you can see above, the deployment lives in a single region. If there is a failure in that region (for example, us-east-1), the static site contents cannot be served. The only exception is cached content still residing at the CloudFront edge locations.

How do we ensure that our site contents are still served even though the region is down?

Adding high-availability

The following AWS components help build high-availability for Amazon S3 type origins:

1. Amazon CloudFront Origin Group
2. Amazon S3 Bucket w/ #4 OAI associated with the bucket policy (deployed in a non-like region)
3. Amazon CloudFront Origin (Amazon S3 type, deployed in a non-like region)
4. Amazon CloudFront OAI (Origin Access Identity for the new Origin)
5. Setting HTTP response codes for the origin group to failover (for example, 5xx errors)
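To make the pieces above a little more concrete, here is a minimal sketch of the origin group portion of a CloudFront distribution config, written as a plain JavaScript object builder. The group ID and origin IDs (spa-origin-group, s3-us-east-1, s3-us-west-2) are hypothetical placeholders, and the status code list is just one reasonable choice of 5xx codes, not a recommendation from the AWS documentation.

```javascript
// Sketch only: builds the OriginGroups section of a CloudFront DistributionConfig.
// All IDs passed in below are hypothetical placeholders.
function buildOriginGroups(groupId, primaryOriginId, failoverOriginId, statusCodes) {
    return {
        Quantity: 1,
        Items: [{
            Id: groupId,
            // CloudFront retries the request against the second member
            // when the primary origin responds with one of these codes.
            FailoverCriteria: {
                StatusCodes: { Quantity: statusCodes.length, Items: statusCodes }
            },
            // Order matters: the first member is the primary origin.
            Members: {
                Quantity: 2,
                Items: [
                    { OriginId: primaryOriginId },
                    { OriginId: failoverOriginId }
                ]
            }
        }]
    };
}

var originGroups = buildOriginGroups(
    'spa-origin-group',
    's3-us-east-1',
    's3-us-west-2',
    [500, 502, 503, 504]
);
console.log(JSON.stringify(originGroups, null, 2));
```

The cache behavior then targets the origin group ID rather than an individual origin ID, which is what lets CloudFront fail over on its own.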

This is detailed here, in the AWS documentation. We’ll focus on a few areas that weren’t necessarily specified in the documentation.

Amazon S3 bucket (cross region)

Ensure versioning is enabled on the bucket, as it is a dependency for Amazon S3 replication.
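As a rough illustration, the request parameters for enabling versioning and then configuring replication could be built like this. This is a sketch under assumptions: the bucket names, rule ID, and IAM role ARN are hypothetical, and the objects mirror the shape of the S3 PutBucketVersioning and PutBucketReplication request parameters rather than calling any SDK.

```javascript
// Sketch only: parameter objects shaped like the S3 PutBucketVersioning and
// PutBucketReplication requests. Bucket names and the role ARN are hypothetical.
function buildVersioningParams(bucket) {
    return {
        Bucket: bucket,
        VersioningConfiguration: { Status: 'Enabled' }
    };
}

function buildReplicationParams(sourceBucket, destinationBucket, roleArn) {
    return {
        Bucket: sourceBucket,
        ReplicationConfiguration: {
            Role: roleArn,
            Rules: [{
                ID: 'replicate-spa-assets', // hypothetical rule name
                Status: 'Enabled',
                Priority: 1,
                Filter: { Prefix: '' }, // empty prefix matches every object
                DeleteMarkerReplication: { Status: 'Disabled' },
                Destination: { Bucket: 'arn:aws:s3:::' + destinationBucket }
            }]
        }
    };
}

// Versioning has to be enabled on BOTH buckets before replication is configured.
var versioning = ['spa-site-us-east-1', 'spa-site-us-west-2'].map(buildVersioningParams);
var replication = buildReplicationParams(
    'spa-site-us-east-1',
    'spa-site-us-west-2',
    'arn:aws:iam::123456789012:role/spa-replication-role'
);
console.log(JSON.stringify({ versioning: versioning, replication: replication }, null, 2));
```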

OAI - Origin Access Identity

A lesson learned: use ONE unique OAI per Amazon S3 bucket policy. You cannot share an OAI cross-region (for example, by reusing the one you created for your static site content in the primary region). A shared OAI will cause issues, specifically with Lambda@Edge functions.

oai
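A minimal sketch of the one-OAI-per-bucket idea: each bucket gets its own policy, and each policy names only that bucket’s OAI as the principal. The bucket names and OAI IDs below are made-up placeholders.

```javascript
// Sketch only: builds a read-only bucket policy for a single CloudFront OAI.
// Bucket names and OAI IDs used below are hypothetical.
function buildOaiBucketPolicy(bucketName, oaiId) {
    return {
        Version: '2012-10-17',
        Statement: [{
            Sid: 'AllowCloudFrontOaiRead',
            Effect: 'Allow',
            Principal: {
                AWS: 'arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity ' + oaiId
            },
            Action: 's3:GetObject',
            Resource: 'arn:aws:s3:::' + bucketName + '/*'
        }]
    };
}

// One distinct OAI per bucket, rather than one OAI reused across regions.
var primaryPolicy = buildOaiBucketPolicy('spa-site-us-east-1', 'E1PRIMARYEXAMPLE');
var failoverPolicy = buildOaiBucketPolicy('spa-site-us-west-2', 'E2FAILOVEREXAMPLE');
console.log(JSON.stringify([primaryPolicy, failoverPolicy], null, 2));
```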

Don’t let DNS be your dependency in HA design

Amazon CloudFront Origin Groups allow for automatic failover based on origin-response HTTP response codes, but always be mindful of DNS. If Amazon Route 53 hosts your DNS forward zone, make sure you’ll still be able to access those records during a region failover. For example, if the ability to modify DNS records in an Amazon Route 53 forward zone is impacted during a region outage, what good is a complex cross-region design?

Utilizing a 3rd party DNS provider helps ensure that if there is a service impact to your cloud provider platform, your ability to failover shouldn’t be impacted.

Lambda@Edge can be mighty slow to de-replicate

A Lambda@Edge function associated with a CloudFront distribution’s behavior is replicated to multiple edge locations. This also means that when that function is disassociated from the behavior, it can take time to dereplicate (somewhere between 1 hour and 24 hours).

Why do I mention Lambda@Edge replication times? If you utilize CloudFormation or AWS CDK (Cloud Development Kit), the stack will likely fail to delete the function, because it takes time to deregister the function from the behavior.

Unfortunately, this issue also affects other Infrastructure as Code provisioners, such as Terraform, as shown here.

What’s the fix/resolution? There doesn’t seem to be a generally accepted way to handle this quite yet. In my case, if I know I need to drop the stack, I’ll disassociate the function from the behavior a few hours prior. I suppose it’s akin to pre-warming the oven?

Honorable mentions

Handling index.html redirection

In most cases, you’ll need to define your default root object in Amazon CloudFront (generally index.html). In addition to this, you may need an Amazon CloudFront Function, associated with the behavior’s viewer request event (VIEWER_REQUEST), that rewrites the URI to include index.html.

function handler(event) {
    var request = event.request;
    var uri = request.uri;
    
    // Check whether the URI is missing a file name.
    if (uri.endsWith('/')) {
        request.uri += 'index.html';
    } 
    // Check whether the URI is missing a file extension.
    else if (!uri.includes('.')) {
        request.uri += '/index.html';
    }

    return request;
}

This is documented in detail over here on the AWS documentation page.
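One edge case worth a quick sketch: the check above looks for a dot anywhere in the URI, so a path such as /v1.2/about would be treated as if it had a file extension. The variant below is my own assumption of how that could be handled, not the AWS sample: it only inspects the last path segment. The rewriteUri helper name is hypothetical.

```javascript
// Sketch only: a variant of the rewrite that checks the LAST path segment
// for a file extension. rewriteUri is a hypothetical helper name.
function rewriteUri(uri) {
    // Directory-style URI: append the index document.
    if (uri.endsWith('/')) {
        return uri + 'index.html';
    }
    // Only the final segment decides whether a file was requested.
    var lastSegment = uri.substring(uri.lastIndexOf('/') + 1);
    if (lastSegment.indexOf('.') === -1) {
        return uri + '/index.html';
    }
    return uri;
}

function handler(event) {
    var request = event.request;
    request.uri = rewriteUri(request.uri);
    return request;
}

// Local usage example with a hand-built event object.
var rewritten = handler({ request: { uri: '/v1.2/about' } });
console.log(rewritten.uri);
```

Splitting the logic into a pure function makes it easy to exercise locally with plain strings before attaching the function to a behavior.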

Redirects

Moving from another platform (F5 Load Balancers + NGINX or IIS)? There is a likelihood you utilized NGINX rewrite rules or an IIS URL rewrite module. What does this equate to in a SPA deployment on Amazon CloudFront + S3 Origins?

Well, that depends... (on how far you want to fall down the rabbit hole). After spending multiple hours on this, I wanted to curate a list of some of the best approaches I’ve found:

Security

You can utilize Amazon CloudFront response headers to enhance your security posture, as blogged about previously.

Conclusion

There are a few (smaller) pitfalls in handling high-availability for SPA deployments in AWS. However, they are generally easy to spot, and hopefully this article helps call out a few as well. Cheers!