This one slipped a bit under the radar, but the other week there was a big change for people running Data Lifecycle Manager (DLM). In my colleague Matt’s weekly AWS update on May 31st, he mentioned the “snapshot across multiple EBS volumes” update. At first glance this didn’t seem to be a big deal. Yep, you can now take multi-volume snapshots in a consistent manner. As a backup person, this is good news, but not much more than a footnote. On closer inspection, there is more to it than meets the eye. This is just a small post, but I wanted to highlight the DLM side of the update.
It was only recently that I went to look at DLM and discovered a new option: a policy can now target tagged instances, not just tagged volumes.
This is actually huge! One of the big benefits of tools like Cloud Protection Manager (CPM) is the ability to tag instances rather than just volumes. Apart from being easier to manage, it means that if volumes get added or removed, they will automatically be added/removed from the backup. No worry about getting a call for a restore and discovering the volume was added at a later stage and not tagged.
There is a further AWS blog post on this, covering DLM and more: https://aws.amazon.com/blogs/storage/taking-crash-consistent-snapshots-across-multiple-amazon-ebs-volumes-on-an-amazon-ec2-instance/
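If you want to see what the multi-volume snapshot looks like outside of DLM, here is a minimal sketch using boto3's EC2 `create_snapshots` call. The function names, the instance ID and the description are my own placeholders, and I'm assuming default credentials and region are already configured.

```python
def build_multi_volume_request(instance_id, description, exclude_boot_volume=False):
    """Build the keyword arguments for EC2 create_snapshots.

    One request covers every EBS volume attached to the instance, which is
    what gives the crash-consistent set across volumes.
    """
    return {
        "InstanceSpecification": {
            "InstanceId": instance_id,
            "ExcludeBootVolume": exclude_boot_volume,
        },
        "Description": description,
        # Carry the tags on each volume over to its snapshot.
        "CopyTagsFromSource": "volume",
    }


def snapshot_instance(instance_id, description):
    """Take the multi-volume snapshot and return the new snapshot IDs."""
    import boto3  # imported here so the builder above works without boto3

    ec2 = boto3.client("ec2")
    response = ec2.create_snapshots(
        **build_multi_volume_request(instance_id, description)
    )
    return [snap["SnapshotId"] for snap in response["Snapshots"]]


if __name__ == "__main__":
    # Placeholder instance ID, printed only; nothing is sent to AWS here.
    print(build_multi_volume_request("i-0123456789abcdef0", "manual multi-volume test"))
```

Calling `snapshot_instance` is the part that actually talks to AWS, so only do that against an instance you are happy to snapshot.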
I’ve tested out the DLM changes and they work as stated. A big improvement on having to tag each volume. One thing that is a little disappointing is that AWS Backup still works at the EBS volume level and does not have the option to just back up the instance. Hopefully this changes soon.
For those of you going, “that’s nice, but what is DLM?”, I’ll give you a quick recap. DLM is a simple backup option, I won’t say “solution”, provided by AWS. It can be found in the EC2 Dashboard under the Elastic Block Store section, and as that suggests, only manages snapshots of EBS.
One of the benefits of DLM is that it is very easy to set up. You give it a policy name, tell it the tag to use, the schedule and retention. The schedule is quite limited, but that also makes it easy to configure. It only repeats within a daily cycle, at a fixed interval of hours, so there is no option to do weekly or monthly. The start time is also in UTC, so keep that in mind. Also, it’s not a guaranteed start time, but generally close. Finally, you set the number of snapshots you want to keep, i.e. retention.
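The same settings can be scripted. This is only a sketch of how I'd do it with boto3's `dlm` client: `build_policy_details` and `create_policy` are my own helper names, the tag and times are made up, and you would need to supply your own role ARN.

```python
def build_policy_details(tag_key, tag_value, start_time_utc, retain_count,
                         interval_hours=24, resource_type="VOLUME"):
    """Build the PolicyDetails structure for a DLM snapshot policy."""
    if resource_type not in ("VOLUME", "INSTANCE"):
        raise ValueError("resource_type must be 'VOLUME' or 'INSTANCE'")
    return {
        "ResourceTypes": [resource_type],
        # The tag DLM looks for when picking what to snapshot.
        "TargetTags": [{"Key": tag_key, "Value": tag_value}],
        "Schedules": [{
            "Name": f"every-{interval_hours}h",
            "CopyTags": True,
            "CreateRule": {
                "Interval": interval_hours,
                "IntervalUnit": "HOURS",
                # Start time is HH:MM in UTC, not local time.
                "Times": [start_time_utc],
            },
            # Retention: how many snapshots to keep.
            "RetainRule": {"Count": retain_count},
        }],
    }


def create_policy(description, role_arn, details):
    """Create the policy in AWS and return its ID."""
    import boto3  # imported here so the builder above works without boto3

    dlm = boto3.client("dlm")
    response = dlm.create_lifecycle_policy(
        ExecutionRoleArn=role_arn,
        Description=description,
        State="ENABLED",
        PolicyDetails=details,
    )
    return response["PolicyId"]


if __name__ == "__main__":
    # Hypothetical tag and schedule: daily at 03:00 UTC, keep 7 snapshots.
    print(build_policy_details("Backup", "true", "03:00", 7))
```

Building the dictionary separately from the API call means you can print and check the policy before anything is created.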
Obviously, you also need an IAM role to allow DLM to access and manage the snapshots. If you don’t already have one, DLM will make the appropriate one for you. Again, simple!
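If you would rather create the role yourself, the sketch below shows roughly what is involved. The role name is a placeholder of mine, and the managed policy ARN is the AWS-provided DLM service role policy as far as I know, so check it in the IAM console before relying on it.

```python
import json

# Trust policy that lets the DLM service assume the role.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "dlm.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

# Assumption: AWS-managed policy with the snapshot permissions DLM needs.
MANAGED_POLICY_ARN = (
    "arn:aws:iam::aws:policy/service-role/AWSDataLifecycleManagerServiceRole"
)


def create_dlm_role(role_name="my-dlm-role"):
    """Create the role, attach the managed policy, and return the role ARN."""
    import boto3  # imported here so the constants above work without boto3

    iam = boto3.client("iam")
    role = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(TRUST_POLICY),
    )
    iam.attach_role_policy(RoleName=role_name, PolicyArn=MANAGED_POLICY_ARN)
    return role["Role"]["Arn"]


if __name__ == "__main__":
    print(json.dumps(TRUST_POLICY, indent=2))
```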
One other feature that I think is really nice is the policy summary. Once you have your schedule and retention configured, there is a box that tells you what will happen in (mostly) human terms.
OK, great! We now know how to configure DLM, but what’s the difference between the Instance & Volume options? Why would you use Volume? I’m glad you asked.
Firstly, let me say that while DLM is easy to set up and works nicely, I’d only really use it for test/dev type situations. For production, you really do want a fully fledged backup solution.
OK, back to the question. Let’s assume you have a DB server in your test environment. Ideally, you should send the archive logs to a separate volume (a separate mount point on Linux or a separate drive on Windows). Let’s also assume you keep 24 hours’ worth of logs. In this sort of situation, you would have two policies. The first policy is Instance based, with a nightly (24hr) backup scheduled AFTER the DB export is done. The second policy captures the archive logs. With the logs on a separate volume, this is where you’d choose a Volume based policy. Run it every two hours (the minimum option for DLM) and just keep a retention of 3. If there are 24 hours’ worth of logs on the volume, you don’t need to keep too many snapshots. So if you need to do a restore, the instance is recovered from the nightly, and the most recent archive logs come from the volume.
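To make the two-policy setup concrete, here is a sketch of what the policy details could look like in the shape boto3's `dlm` client expects. The tag names, the start times and the nightly retention of 7 are assumptions of mine; only the 24hr/2hr intervals and the retention of 3 come from the scenario above.

```python
# Policy 1: nightly snapshot of every volume on the tagged instance.
NIGHTLY_INSTANCE_POLICY = {
    "ResourceTypes": ["INSTANCE"],
    "TargetTags": [{"Key": "Backup", "Value": "nightly"}],  # hypothetical tag
    "Schedules": [{
        "Name": "nightly",
        "CreateRule": {
            "Interval": 24,
            "IntervalUnit": "HOURS",
            "Times": ["02:00"],  # UTC; assumed to be after the DB export
        },
        "RetainRule": {"Count": 7},  # assumed retention
    }],
}

# Policy 2: frequent snapshot of just the archive log volume.
LOG_VOLUME_POLICY = {
    "ResourceTypes": ["VOLUME"],
    "TargetTags": [{"Key": "Backup", "Value": "archive-logs"}],  # hypothetical tag
    "Schedules": [{
        "Name": "archive-logs",
        "CreateRule": {
            "Interval": 2,
            "IntervalUnit": "HOURS",
            "Times": ["00:00"],  # UTC; assumed
        },
        "RetainRule": {"Count": 3},
    }],
}


def max_snapshot_age_hours(policy):
    """Rough upper bound on the age of the oldest retained snapshot."""
    schedule = policy["Schedules"][0]
    return schedule["CreateRule"]["Interval"] * schedule["RetainRule"]["Count"]


if __name__ == "__main__":
    for name, policy in [("nightly", NIGHTLY_INSTANCE_POLICY),
                         ("archive logs", LOG_VOLUME_POLICY)]:
        print(name, "oldest snapshot is at most about",
              max_snapshot_age_hours(policy), "hours old")
```

With a 2 hour interval and a retention of 3, the oldest log snapshot is never more than about 6 hours old, which is fine because the volume itself already holds 24 hours of logs.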
In summary, if you are looking for a cheap and easy way to back up your EC2 infrastructure, DLM is definitely worth looking at.