Some tips on deploying AWS Backup

Cross-region and cross-account use cases

The purpose of this article is to share some technical issues I encountered, and how to avoid them, when deploying the AWS Backup managed service following the “3-2-1” backup rule (or at least getting close to it).

There will be no advice on backup strategies or RPO/RTO, and no introduction to configuring the AWS Backup service. For these topics, I suggest you take a look at the impressive list of over 40 AWS articles; I also added a few other AWS publications that are not referenced on their own list…

AWS Backup is a good managed service for simplified backup of supported AWS resources, but I found its documentation less detailed than that of most AWS services, especially for ‘more advanced’ use cases like the one explained in this post. Strangely, the API documentation does not live in a dedicated space (as it does for other services) but is integrated directly into the general Developer Guide.

In terms of usage costs, with small amounts of data to back up, the cost of the service is fairly low.

My use case

My points concern cross-region and cross-account backup of DynamoDB tables and S3 buckets. In addition, I will discuss features to improve monitoring and governance across your entire AWS organization.

For our example, we want to make a first backup of an S3 bucket and a DynamoDB table to a local vault (same region and account). A copy of this backup is then made to a remote vault located in a central account and in another region.

Cross-region and cross-account backup use case HLD

DynamoDB tables and S3 buckets have the big advantage of supporting most AWS Backup features, especially cross-Region backup and cross-account backup (both!) and full AWS Backup management. The latter greatly simplifies the encryption mechanism (called independent encryption).

Given the many features, check that the AWS resources you want to back up support the same features as the two resource types used as examples here. There may also be specific restrictions (listed under the table).


A few service terms, so that we understand each other…

Backup vault: the regional container storing the backups; the latter are named recovery points.

Backup job: the action of backing up an AWS resource to a backup vault in the same account AND region. This job produces a recovery point.

Copy job: the action of copying a recovery point to another backup vault. It can be local (same account and region) or remote (cross-account and/or cross-region; restrictions apply). This job produces a recovery point.

Restore job: the action of restoring a recovery point from a backup vault to a resource (new, existing or original, depending on the resource type). No direct cross-region or cross-account restore is possible.

A shared vault from multiple sources

This might seem obvious, but I couldn’t find it in the documentation: you can use the same backup vault to store different recovery points (backups) that have been copied from different source vaults. This allows you to have a unique shared/central vault for different source accounts, without any impact or limitations. That’s great!

To set this up, pay attention to the following points:

Cross-account management features

With Cross-account management, you can:

  1. Natively centralize the regional job details of all the accounts of your AWS organization in the Management Account or in a Delegated Administrator account.
  2. Enable the use of Backup Policies (managed by the AWS Organizations service), which differ from backup plans (see tip below).
  3. Enable cross-account backups (copy jobs).

These three features have to be enabled from the Management Account on two different services.

Through the AWS Console, they are on a single pane of glass, but if you want to manage them the right way, meaning with Terraform/OpenTofu/CloudFormation, it’s not obvious where to find them; I summarise them in the table below.

Feature | How to enable it | API endpoint
--- | --- | ---
Backup policies | Global setting managed by the AWS Organizations service. Add BACKUP_POLICY as an enabled policy type | EnablePolicyType
Cross-account monitoring | Global setting managed by the AWS Organizations service. Add AWS Backup as an enabled service | EnableAWSServiceAccess
Cross-account backup | Global setting managed by the AWS Backup service. Set isCrossAccountBackupEnabled | UpdateGlobalSettings
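If you script this rather than use an IaC tool, the three API calls from the table above could be sketched as follows. This is a minimal sketch, not a reference implementation: it assumes the two clients behave like boto3.client("organizations") and boto3.client("backup") created with Management Account credentials, and the function name and the root ID used in the usage note are hypothetical.

```python
def enable_cross_account_features(org_client, backup_client, root_id):
    """Enable the three cross-account features from the Management Account.

    org_client / backup_client are assumed to behave like
    boto3.client("organizations") and boto3.client("backup").
    root_id is the ID of your organization root (hypothetical value in usage).
    """
    # 1. Backup policies: enable the BACKUP_POLICY policy type on the root.
    org_client.enable_policy_type(RootId=root_id, PolicyType="BACKUP_POLICY")

    # 2. Cross-account monitoring: enable AWS Backup as a trusted service.
    org_client.enable_aws_service_access(ServicePrincipal="backup.amazonaws.com")

    # 3. Cross-account backup: AWS Backup global setting.
    #    The value is the string "true", not a boolean.
    backup_client.update_global_settings(
        GlobalSettings={"isCrossAccountBackupEnabled": "true"}
    )
```

With boto3 installed, a call would look like enable_cross_account_features(boto3.client("organizations"), boto3.client("backup"), "r-examplerootid"). Passing the clients in as parameters keeps the function easy to exercise with stubs before touching a real organization.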

Delegated Administrator

I recommend you set up a Delegated Administrator to avoid Management Account usage (no-go zone). You can select up to five delegated accounts.

A delegated admin can see all the details of backup, copy and restore jobs across the organization, for a specific region, for up to 30 days.

To set this up, pay attention to the following points:

  • It requires Cross-account monitoring (covered above) to be enabled before delegating.
  • Then you can add your delegated administrator account(s).
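The delegation itself can also be scripted. The sketch below assumes org_client behaves like boto3.client("organizations") used from the Management Account, and that Cross-account monitoring is already enabled; the function name and the account IDs are hypothetical. The limit of five accounts comes from the text above.

```python
MAX_DELEGATED_ACCOUNTS = 5  # limit mentioned above


def register_backup_delegated_admins(org_client, account_ids):
    """Register member accounts as delegated administrators for AWS Backup.

    org_client is assumed to behave like boto3.client("organizations").
    Cross-account monitoring must already be enabled (see previous section).
    """
    if len(account_ids) > MAX_DELEGATED_ACCOUNTS:
        raise ValueError(
            f"at most {MAX_DELEGATED_ACCOUNTS} delegated accounts are allowed"
        )
    for account_id in account_ids:
        org_client.register_delegated_administrator(
            AccountId=account_id,
            ServicePrincipal="backup.amazonaws.com",
        )
    return list(account_ids)
```

For example, register_backup_delegated_admins(boto3.client("organizations"), ["111122223333"]) would delegate to a single (placeholder) account.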

Cross-account monitoring console from delegated account

A new monitoring dashboard was announced at re:Invent 2023.

From a regular account, you can see your own regional jobs; but from a delegated admin account, you have all regional jobs of your organization.

New dashboard from delegated account

The second delegated feature is the management of (Organization) Backup Policies. This has been possible since November 2022, when the AWS Organizations service made it possible to attach a resource-based policy to the organization itself in order to delegate policy management. Since then, you can authorize another account to manage Backup Policies.

The AWS Organizations doc provides an example that allows a member account to manage any backup policy.

Backup policies console from delegated account

Advanced DynamoDB backup setting with a Backup policy

For resource type opt-in, it’s crystal clear: the opt-in settings set on the Management Account are used by any backup policy, instead of the opt-in settings set on the member account where the backup is made. In short, enable the resource types you need to back up at the Management Account level to use your backup policy widely.

However, the AWS documentation does not mention advanced DynamoDB backup setting behavior with a backup policy at all.

After testing it (with the setting disabled on the member account), I can confirm that both settings (resource opt-in and advanced DynamoDB backup) have the same override behavior when they are enabled on the Management Account. This native override behavior saves you from adding the setting to all accounts and regions. 💪

The doc states that any AWS account with a vault created after Nov 21, 2021 has the setting enabled. However, unless you have invited an older account, your Management Account is the oldest account of your organization, so don’t forget to enable the advanced DynamoDB backup setting on it.
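As a sketch, both settings can be enabled on the Management Account with the UpdateRegionSettings API, once per region you back up from. The function below assumes make_backup_client returns an object behaving like boto3.client("backup", region_name=region); the function name and the region list are hypothetical. Check how the call interacts with your existing opt-in preferences before running it for real.

```python
def enable_backup_region_settings(make_backup_client, regions):
    """Enable resource opt-in and advanced DynamoDB backup in each region.

    make_backup_client(region) is assumed to return an object behaving like
    boto3.client("backup", region_name=region) for the Management Account.
    """
    for region in regions:
        client = make_backup_client(region)
        client.update_region_settings(
            # Resource type opt-in: which resource types AWS Backup may protect.
            ResourceTypeOptInPreference={"DynamoDB": True, "S3": True},
            # Advanced DynamoDB backup (full AWS Backup management).
            ResourceTypeManagementPreference={"DynamoDB": True},
        )
    return list(regions)
```

With boto3, the factory could be lambda region: boto3.client("backup", region_name=region), called for example with ["eu-west-1", "eu-central-1"].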

Latency for cross-account copies

Be aware that a cross-account copy job could take more time than a backup job.

You should test your cross-account copy jobs and widen the interval between scheduled runs accordingly. In my tests, for a small S3 bucket (< 1 GB), a backup job always took less than one hour, but a cross-account copy job sometimes took almost 3 hours.

You can set any schedule for your copy job (the minimum interval is 1 hour), but with a short interval you could generate dozens of jobs running in parallel, some of which will end in error.

Backup Plan and Backup Policy are different

Despite their different syntaxes, I thought the capabilities of a backup plan in the AWS Backup service and a backup policy in the AWS Organizations service were the same. I was disappointed to learn that this is not the case.

The AWS documentation does not explain that the capabilities differ and does not clearly state the differences. The doc is even confusing when it states that “Backup policies in AWS Organizations combine all of those pieces into JSON text documents.” while referring to the capabilities of the backup plan.

The differences that I was able to test:

  • A Backup Plan allows more granular resource selection/assignment: tag-based, full ARN or ARN wildcard.
  • A Backup Policy allows only tag-based resource selection/assignment.

So if you currently use a backup plan with ARN selections, you must add a ‘backup tag’ to your resources before using a backup policy.
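To make the tag-only selection concrete, here is a sketch that builds a backup policy document for the use case of this article (local backup plus remote copy). The structure follows the Organizations backup policy syntax as I understand it, so validate it against the official syntax reference before attaching it. The function name and all other names (plan, rule, selection, vault, role, tag, schedule and retention values) are hypothetical placeholders.

```python
import json


def build_backup_policy(tag_key, tag_values, regions,
                        local_vault_name, remote_vault_arn, role_name):
    """Build an Organizations backup policy with a tag-based selection."""
    return {
        "plans": {
            "daily-plan": {  # hypothetical plan name
                "regions": {"@@assign": regions},
                "rules": {
                    "daily-rule": {  # hypothetical rule name
                        "schedule_expression": {"@@assign": "cron(0 5 ? * * *)"},
                        "target_backup_vault_name": {"@@assign": local_vault_name},
                        "lifecycle": {"delete_after_days": {"@@assign": "35"}},
                        # Copy to the remote (central account) vault.
                        "copy_actions": {
                            remote_vault_arn: {
                                "target_backup_vault_arn": {"@@assign": remote_vault_arn},
                                "lifecycle": {"delete_after_days": {"@@assign": "35"}},
                            }
                        },
                    }
                },
                # Only tag-based selections exist here: no ARN, no wildcard.
                "selections": {
                    "tags": {
                        "by-backup-tag": {  # hypothetical selection name
                            # $account is resolved per member account.
                            "iam_role_arn": {
                                "@@assign": f"arn:aws:iam::$account:role/{role_name}"
                            },
                            "tag_key": {"@@assign": tag_key},
                            "tag_value": {"@@assign": tag_values},
                        }
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    policy = build_backup_policy(
        "backup", ["daily"], ["eu-west-1"], "local-vault",
        "arn:aws:backup:eu-central-1:111122223333:backup-vault:central-vault",
        "backup-role",
    )
    print(json.dumps(policy, indent=2))
```

The printed JSON is what you would pass as the policy content when creating a policy of type BACKUP_POLICY.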

Inconsistency in monitoring fields

To monitor job status in AWS Backup, you can use CloudWatch metrics, EventBridge events, or even CloudTrail events if required.

When using AWS Backup events streamed to the EventBridge default bus, be aware that, for the same event type, some field names differ depending on the job type.

The AWS doc page shows an exhaustive list of event examples. But I hadn’t paid attention to these differing fields, and I had to redesign and add EventBridge rules to take the inconsistencies into account.

The table below lists the field names of Job State Change events for each job type (backup, copy, restore).

Event for | Status field name | Resource field name | Job ID field name
--- | --- | --- | ---
Backup jobs | state | detail.resourceArn | detail.backupJobId
Copy jobs | state | resources[0] | detail.copyJobId
Restore jobs | status | resources[0] | detail.restoreJobId

Depending on how you process these events, you may need to create one EventBridge rule per job type, matching the events on their specific field names.
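If the events end up in your own code (a Lambda function, for instance), one way to absorb these differences is to normalize them right at the entry point. The sketch below encodes the table above; it assumes the status fields live under detail (detail.state, detail.status), and the function names and sample values are hypothetical.

```python
# Field names per job type, taken from the table above.
# Assumption: the status and job ID fields are located under "detail".
JOB_FIELDS = {
    "Backup Job State Change": {"status": "state", "job_id": "backupJobId"},
    "Copy Job State Change": {"status": "state", "job_id": "copyJobId"},
    "Restore Job State Change": {"status": "status", "job_id": "restoreJobId"},
}


def normalize_job_event(event):
    """Return one common shape whatever the job type of the event."""
    detail_type = event["detail-type"]
    fields = JOB_FIELDS[detail_type]
    detail = event["detail"]
    if detail_type == "Backup Job State Change":
        resource = detail["resourceArn"]
    else:
        # Copy and restore jobs expose the resource at the top level.
        resource = event["resources"][0]
    return {
        "job_type": detail_type,
        "status": detail[fields["status"]],
        "resource": resource,
        "job_id": detail[fields["job_id"]],
    }


def state_change_pattern(detail_type, statuses):
    """Build an EventBridge event pattern for one job type."""
    status_field = JOB_FIELDS[detail_type]["status"]
    return {
        "source": ["aws.backup"],
        "detail-type": [detail_type],
        "detail": {status_field: statuses},
    }
```

For example, state_change_pattern("Restore Job State Change", ["FAILED"]) produces a pattern that filters on detail.status, while the same call for backup or copy jobs filters on detail.state.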

Further reading

I only covered a few issues! There are other features that require attention, like backup encryption…