Disaster Tolerance Patterns Using AWS Serverless Services

May 19, 2019
In my previous post (Disaster Recovery for Cloud Solutions is Obsolete), I asserted that you should design your cloud architectures for Disaster Tolerance from the start, even if doing so seems counterintuitive by lean principles. I also argued that it’s easy if you do it now, and that it will help your business even if there is never a disaster.

The problem is that while all of that is true, in practice there are enough gotchas that what should be easy can lead you down a lot of rabbit holes before you get where you need to be. I recently went through the exercise for my current startup (Cloud Pegboard) and would like to share those learnings so that you get the benefits of what’s possible without having to explore every dead end in the maze.

Here’s our challenge: create a new SaaS service on AWS that delights users; make it highly available even through a disaster or failure large enough to knock out an entire AWS region, or an entire service within a region; and do all of this with minimal extra effort and expense to create and operate the service.

We’re a startup, so we need to focus most of our attention on delivering user value, but we are confident enough in our future success to know we don’t want to create a heap of technical debt that could have been readily avoided with a little foresight.

Disaster Tolerance is the ability of a complete operational solution to withstand large-scale faults without requiring any (or at least any significant) manual intervention. It is fault tolerance expanded to cover disaster-level faults, such as the failure of an entire region.

Disaster Tolerance contrasts with Disaster Recovery, an approach that reacts to a disaster incident by executing a set of one-time “recovery” procedures to restore service.
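To make the contrast concrete, here is a minimal sketch of the tolerance idea: a caller that automatically fails over across regional endpoints, so a region-level fault never triggers a manual recovery procedure. The endpoint URLs and helper names are hypothetical, not from the post, and in a real serverless design this failover would more likely live in DNS (for example, Route 53 health checks) than in client code.

```python
# Hypothetical sketch -- endpoint URLs and helper names are illustrative.

def call_with_failover(endpoints, request_fn):
    """Try each regional endpoint in order; return the first successful
    response. Automatic failover is what makes a design disaster
    *tolerant*, rather than relying on a human-driven disaster
    *recovery* runbook after the fact."""
    last_error = None
    for endpoint in endpoints:
        try:
            return request_fn(endpoint)
        except ConnectionError as exc:  # real code: catch transport errors
            last_error = exc
    raise last_error


# Simulate the primary region being down and the secondary taking over.
regions = [
    "https://api.us-east-1.example.com",
    "https://api.us-west-2.example.com",
]

def fake_request(url):
    if "us-east-1" in url:
        raise ConnectionError("region unavailable")
    return "ok from " + url

print(call_with_failover(regions, fake_request))
```

The key property is that no step here is a one-time recovery action: the fallback path is exercised automatically, every time, which is why it also helps during ordinary partial failures and not just disasters.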

Source: medium.com

