AWS Storage Blog

Recovering from a disaster using AWS Storage Gateway and Amazon S3 Glacier

In the 1960s, a steel box changed the world. The welded steel cargo container may be one of the dullest inventions in history, but you are surrounded by its benefits each and every day. Before the 1960s, items were individually loaded onto a cargo ship and removed when the ship arrived at its destination. This was a labor-intensive operation that exposed the goods to damage or theft.

The introduction of the standard-sized cargo container allowed both heavy and light goods to be packed securely and safely. It reduced the amount of labor required throughout the shipping process, thereby lowering costs dramatically. Lower costs, efficient packaging, and faster shipping expanded the choice of products consumers could buy locally.

The original shipping container for data was the magnetic tape spool, and it was introduced a decade earlier than the cargo container. Developed by the Eckert-Mauchly Computer Corporation and introduced by Remington Rand, the UNISERVO I tape drive acted as an input-output device for the UNIVAC I computer. Spools of tape replaced paper punched cards as information storage media. Over time, as storage density increased, the spool shrank to become a small plastic cartridge. As the density and form factor of tape changed, the storage performance hierarchy also changed. Tape started as primary storage, and has over time become the storage of last resort.

In this blog post, I discuss disaster recovery (DR) planning and the three key measurements by which a DR plan should be evaluated. I then focus on the advantages of Tape Gateway with Amazon S3 Glacier and S3 Glacier Deep Archive. After examining long-term data retention, archiving, and recovery of virtual tapes, I finish by looking at restoring data through a different Tape Gateway.

Overview

A last resort makes us think of circumstances you plan for but hope you will never have to live through. In the context of DR, it is the expectation that at some stage you will lose all or part of your data center capability. Having planned for this loss, you should have the critical compute, storage, and software available elsewhere to support data processing and a resumption of business operations. A DR process can involve downtime; how much depends on the type of DR solution you have deployed.

In DR planning, there are three measurements by which a DR plan should be evaluated. The first is the Recovery Point Objective (RPO), the maximum acceptable window of time before a disaster in which changes or data might be lost as a result of a recovery operation. In some cases, that can be measured in days or weeks; for other applications, the closer to zero seconds of data loss, the better. The second is the Recovery Time Objective (RTO), the maximum acceptable time to bring applications, data, or systems back to an operational state. The third is cost, which is why different types of solutions exist to satisfy different recovery point and recovery time objectives.
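To make the first two measurements concrete, here is a minimal Python sketch that checks a plan against its objectives. The class and function names are my own illustration, and the sketch assumes that worst-case data loss can be approximated by the interval between backups.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryObjectives:
    """Targets a DR plan is evaluated against (illustrative names)."""
    rpo: timedelta  # maximum acceptable window of lost changes
    rto: timedelta  # maximum acceptable time to restore service


def evaluate_plan(objectives, backup_interval, estimated_restore_time):
    """Report whether each objective is met.

    Assumption: a disaster just before the next backup loses one full
    backup interval, so that interval stands in for worst-case data loss.
    """
    return {
        "rpo_met": backup_interval <= objectives.rpo,
        "rto_met": estimated_restore_time <= objectives.rto,
    }


if __name__ == "__main__":
    targets = RecoveryObjectives(rpo=timedelta(hours=24), rto=timedelta(hours=8))
    # Backups every 12 hours, but a restore estimated at 14 hours.
    print(evaluate_plan(targets, timedelta(hours=12), timedelta(hours=14)))
```

In this example the backup schedule satisfies the RPO, but the restore estimate misses the RTO, which is the kind of gap that usually pushes a design toward a faster and more expensive solution.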

AWS solutions

AWS offers several solutions to recover from an on-premises data center disaster in the cloud. These solutions range from online disk-to-disk asynchronous replication solutions such as CloudEndure Disaster Recovery to services that enable recovery of long-term data stored in Amazon S3 Glacier and Amazon S3 Glacier Deep Archive. Disk and object-based storage systems are the target of choice for operational backup recovery; that is, data that may need to be restored a few hours or days after it was first backed up. Not all data that is backed up or archived is required for an operational restore, but a significant amount of data under multi-year retention is still held on tape cartridges.

Tape Gateway, a member of the AWS Storage Gateway service, is a virtual tape library (VTL) that emulates a tape medium changer, tape drives, and tape cartridges. Unlike on-premises VTLs from some other vendors, the bulk of the storage for Tape Gateway is provided by Amazon S3.

Tape Gateway and Amazon S3 Glacier advantages

You can deploy Tape Gateway on a virtual machine (VM) or a hardware appliance to give your backup and archiving applications a portal into the three Amazon S3 storage classes used by Tape Gateway: S3 Standard, S3 Glacier, and S3 Glacier Deep Archive.

How backup works with Tape Gateway

S3 Standard is widely supported by backup and data archiving applications today. Some application developers have added further S3 storage classes as target destinations through product updates delivered to their customer base. For those on application versions that do not support S3 Glacier and S3 Glacier Deep Archive, a tape storage interface is the easiest way to access these long-term retention storage classes.

As per standard on-premises VTL practice, virtual tapes can be created automatically by a Tape Gateway, but with Tape Gateway you are only charged for data written to those virtual tapes. When a tape is ejected from the VTL, it is archived (moved) to a predefined long-term retention pool. You can archive virtual tapes to either a GLACIER pool or a DEEP ARCHIVE pool. Retrieval of data in S3 Glacier typically takes 3–5 hours, and retrieval of data in S3 Glacier Deep Archive typically takes 5–12 hours.
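If you script tape creation rather than relying on automatic creation, the retention pool is chosen when the tape is created. The following is a minimal sketch, not a definitive implementation: it only builds the keyword arguments for the Storage Gateway `CreateTapes` operation, where the two default pools are identified as `GLACIER` and `DEEP_ARCHIVE`. The helper name, the example ARN, and the barcode prefix are hypothetical, and the sketch deliberately ignores custom tape pools.

```python
import uuid

GIB = 1024 ** 3  # CreateTapes expects a size in bytes aligned to whole gibibytes


def build_create_tapes_request(gateway_arn, size_gib, count, barcode_prefix,
                               pool_id="GLACIER", worm=False):
    """Build keyword arguments for the Storage Gateway CreateTapes call.

    Hypothetical helper: it only assembles the request and does not contact AWS.
    """
    if pool_id not in ("GLACIER", "DEEP_ARCHIVE"):
        # Assumption: this sketch handles only the two default pools.
        raise ValueError(f"unsupported pool: {pool_id}")
    return {
        "GatewayARN": gateway_arn,
        "TapeSizeInBytes": size_gib * GIB,
        "ClientToken": str(uuid.uuid4()),  # idempotency token for safe retries
        "NumTapesToCreate": count,
        "TapeBarcodePrefix": barcode_prefix,
        "PoolId": pool_id,  # where the tape is archived once it is ejected
        "Worm": worm,       # write-once-read-many tape when True
    }


if __name__ == "__main__":
    # Placeholder ARN for illustration only.
    request = build_create_tapes_request(
        "arn:aws:storagegateway:us-east-1:111122223333:gateway/sgw-12A3456B",
        size_gib=100, count=2, barcode_prefix="DR", pool_id="DEEP_ARCHIVE",
    )
    print(request["PoolId"], request["TapeSizeInBytes"])
```

With boto3 installed and credentials configured, the resulting dictionary could be passed as `boto3.client("storagegateway").create_tapes(**request)`. Check the service limits on tape size and tape count for your gateway before relying on specific values.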

The choice between S3 Glacier and S3 Glacier Deep Archive is one of access time and economics. The longer you can wait to access the data, the lower the storage cost per gigabyte you are charged to retain it. Recovery Time Objectives and any organizational service level agreements drive the decision to use one pool type over the other.
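As a rough illustration of that trade-off, the sketch below picks the lower-cost pool whose typical retrieval time still fits within an RTO. The function name and the `restore_overhead_hours` parameter are my own assumptions, and the retrieval figures are simply the upper bounds of the typical times quoted in this post, not guarantees.

```python
# Upper bounds of the typical retrieval times quoted in this post, in hours.
TYPICAL_RETRIEVAL_HOURS = {"DEEP_ARCHIVE": 12, "GLACIER": 5}


def choose_pool(rto_hours, restore_overhead_hours=0.0):
    """Pick the lowest-cost archive pool whose retrieval still fits the RTO.

    Pools are tried from lowest storage cost (slowest retrieval) to highest.
    restore_overhead_hours is a hypothetical allowance for the time the backup
    application needs to restore data once the tape is available.
    Returns None when neither archive pool can meet the objective.
    """
    for pool in ("DEEP_ARCHIVE", "GLACIER"):
        if TYPICAL_RETRIEVAL_HOURS[pool] + restore_overhead_hours <= rto_hours:
            return pool
    return None


if __name__ == "__main__":
    for rto in (24, 8, 2):
        print(rto, choose_pool(rto))
```

A result of `None` suggests the data belongs in an operational backup tier rather than in an archive pool.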

There are several advantages to using S3 Glacier and S3 Glacier Deep Archive instead of using physical tape infrastructure for long-term data retention. First, copies of your data in AWS are being stored across multiple Availability Zones in a Region. Second, rich security controls and data encryption restrict access to authorized users and applications. Third, WORM support enforces data retention policies.

These are in addition to the savings generated by eliminating the manual handling and warehousing overhead of managing physical tape cartridges. Also, unlike a tape cartridge sitting on a shelf in a warehouse, data stored in Amazon S3 is designed for 99.999999999% (11 nines) of durability. In addition, fixity checks are carried out on the data by AWS to ensure that no errors have been introduced.
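To put the durability figure in perspective, here is a small worked calculation. It reads the figure as an annual, per-object design target and treats objects as independent, which is a simplifying assumption on my part; the function names are illustrative.

```python
from fractions import Fraction

# 99.999999999% durability, read as a per-object, per-year design target.
ANNUAL_LOSS_PROBABILITY = Fraction(1, 10 ** 11)


def expected_objects_lost_per_year(object_count):
    """Expected number of objects lost in one year under the assumption above."""
    return object_count * ANNUAL_LOSS_PROBABILITY


def years_per_expected_loss(object_count):
    """Average number of years between single-object losses."""
    return 1 / expected_objects_lost_per_year(object_count)


if __name__ == "__main__":
    # Ten million stored objects: on average one loss every 10,000 years.
    print(years_per_expected_loss(10_000_000))
```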

Virtual tape archiving and data restoration with Amazon S3

Once a virtual tape has been archived, for all intents and purposes, the backup application assumes that the virtual tape is sitting on a shelf outside of the VTL. As mentioned earlier, the reality is the virtual tape exists as an archive object stored in Amazon S3. The fact that this virtual tape is available for restoration from S3 Glacier or S3 Glacier Deep Archive storage means that a virtual tape can be restored via a different Tape Gateway.

This is a useful function because it uses the accessibility of the service-managed storage to make data stored for long-term retention available at different locations. If you have a failure on premises, be it of hardware or connectivity, it is possible to restore data archived in S3 Glacier or S3 Glacier Deep Archive through a different Tape Gateway. This can save your business during a disaster event, but note that it may be ill-suited as a data migration tool between locations during normal business operations.

To make virtual tapes available to a different Tape Gateway, the original gateway must be disabled. If you have suffered a critical event in a data center, disabling a gateway that is no longer running or connected will not be an issue. However, if you have ongoing backup and restore jobs that depend on the original gateway, this may be a concern.

Notwithstanding the unavailability of the original gateway during the restoration operation, it is possible with adequate scheduling to restore data from the gateway archival storage to a physical location that differs from where the data originated. Just ensure that no other backup or restore operations are required while long-term data is being restored to the new location.
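The cross-gateway restoration described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a tested runbook: the function name is hypothetical, and `client` is assumed to behave like `boto3.client("storagegateway")`, whose `disable_gateway` and `retrieve_tape_archive` operations are the only calls used.

```python
def restore_tapes_to_gateway(client, tape_arns, target_gateway_arn,
                             original_gateway_arn=None):
    """Retrieve archived virtual tapes to a different Tape Gateway.

    client: an object exposing disable_gateway and retrieve_tape_archive,
            for example boto3.client("storagegateway") (assumption).
    original_gateway_arn: pass it only when the failed gateway still has to
            be disabled as part of the recovery.
    """
    if original_gateway_arn is not None:
        # Caution: disabling is one-way. A disabled gateway cannot be
        # re-enabled, so do this only when the gateway is truly lost.
        client.disable_gateway(GatewayARN=original_gateway_arn)

    retrieved = []
    for tape_arn in tape_arns:
        # This starts the retrieval; the tape becomes usable on the target
        # gateway only after the archive retrieval has completed.
        response = client.retrieve_tape_archive(
            TapeARN=tape_arn, GatewayARN=target_gateway_arn
        )
        retrieved.append(response["TapeARN"])
    return retrieved
```

Passing the client in as a parameter keeps the sketch easy to rehearse against a stub before a real event, which fits the idea of a tested DR plan. Once the tapes appear on the target gateway, the backup application still needs to inventory them before it can restore from them.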

Conclusion

In this blog post, I showed how you can use Tape Gateway for DR with your long-term data archived in AWS. I discussed the important considerations of RTO, RPO, and cost when it comes to DR planning, in addition to the suitability of Amazon S3 Glacier and S3 Glacier Deep Archive for long-term data retention.

Disasters happen more frequently than we expect, but with a tested DR plan and the right set of services, customers can resume normal operations at full speed when a disaster strikes. The rudiments of a DR plan take shape inside your organization. The fundamentals of secure, durable, and accessible storage services are an AWS core competency.

With the proper combination of a tested plan and reliable services, your IT infrastructure can weather the harshest storm. To learn more about the wide range of backup and archive storage solutions that AWS offers, visit the backup and restore page and the data archive page.