Cloud Design Patterns

Snapshot Pattern

Create an S3-backed, point-in-time snapshot of a running instance from the AWS console.

can take seconds or minutes, depending on the amount of data to be saved
snapshots can be used as root volumes on new EC2 instances, creating a near-clone of the snapshotted instance

Stamp Pattern

Create an image of an operating system that is pre-configured for a particular purpose

quickly create multiple instances using the same "stamp"
in AWS parlance, the "stamp" is a custom AMI
Packer is a great tool for creating custom AMIs
Vagrant is another option but I've never built them that way
AMIs are region-specific, so you'll need to build (or copy) one for each region you use
our AMIs come pre-loaded with Docker and Loggly support

Scale-Up Pattern

The scale up pattern is a method that allows a server to change size and specifications dynamically, and as needed.

fancy way of saying we can move from a smaller EC2 box to a larger one by stopping it, changing the instance type and starting it again
possible to automate and schedule the change
only makes sense if CPU or RAM are the bottleneck
does require down time
Terraform can be used to automate this

Scale-Out Pattern

Tie 2 or more instances together to add processing power without incurring downtime

Create an elastic load balancer with forwarding ports and health checks
Create a launch configuration for the instance
Create an auto scaling group with configured CloudWatch alarms and scaling policies
Can react to changes in CPU and other metrics
Will scale out and in, depending on the current environment
When scaling in, will be smart and pick the cheapest instance to terminate
Can be automated with Terraform

On-Demand Disk Pattern

Increase the disk on an already running EC2 instance

requires down time
normally done manually
might be automated
allows for striping via software RAID
the amount of down time required might set off scaling and CloudWatch alarms
can add a new disk and begin using that along with the old one
can add a new disk, copy the data from the old disk, and discard the old disk
can change from magnetic storage to SSD
use of LVM from the beginning can simplify things

Multi-Server Pattern

A variation of the scale-out pattern where availability is the primary concern and the number of servers is fixed rather than dynamic

if you need one server to handle a load, install 2
allows for brown outs, hiccups and other anomalies without affecting user experience
shared session state should be stored in ElastiCache
persistent data should be backed by a redundant cluster
same recipe as in Scale-Out but we don't use a scaling group

Multi-Datacenter Pattern

A nuanced version of the multi-server pattern where multiple availability zones are used

load balancers must live in multiple AZs
EC2 instances must live in multiple AZs
managed services, such as RDS, support multiple AZs
by-hand services, such as MongoDB, need to account for multiple AZs as best they can

Floating IP Pattern

Provides a way to apply zero downtime updates when a load balancer is not being used.

Create an EC2 instance that serves HTTP content to an end user
Assign a floating IP address to the EC2 instance
Create a secondary EC2 instance that serves HTTP content to the end users
Swap the floating IP address to the secondary EC2 instance
Perform modifications to the original EC2 instance
Swap the floating IP address back to the original EC2 instance once modifications are complete
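
The swap sequence above can be sketched as a toy model. This is only an illustration of the ordering; the `FloatingIp` class, the instance IDs and the address are made up, and a real implementation would re-associate an Elastic IP through the AWS API instead.

```python
class FloatingIp:
    """Toy model of a floating (Elastic) IP that points at one instance at a time."""

    def __init__(self, address, instance_id):
        self.address = address
        self.instance_id = instance_id

    def swap_to(self, instance_id):
        # Re-pointing the address is the only step clients ever observe.
        previous, self.instance_id = self.instance_id, instance_id
        return previous


def zero_downtime_update(ip, secondary, modify):
    """Move traffic to the secondary, modify the original, then move traffic back."""
    original = ip.swap_to(secondary)
    modify(original)  # the original receives no traffic while this runs
    ip.swap_to(original)
    return original


updated = []
ip = FloatingIp("203.0.113.10", "i-original")
zero_downtime_update(ip, "i-secondary", updated.append)
```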

Deep Health Check Pattern

The deep health check pattern lets the instances behind the load balancer report on the health of downstream dependencies that the load balancer cannot see for itself.

Only detects downstream failures, doesn't repair them
Example downstream services include databases and Google API calls
JVM projects typically have this pattern baked in via Spring Boot
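
A minimal sketch of what such a health endpoint might compute, assuming each downstream dependency can be probed with a simple callable. The probe names and the 200/503 convention are my assumptions, not something the book prescribes.

```python
def deep_health_check(probes):
    """Run each downstream probe and return an HTTP-style status plus details.

    `probes` maps a dependency name to a zero-argument callable that
    raises an exception when the dependency is unreachable.
    """
    results = {}
    for name, probe in probes.items():
        try:
            probe()
            results[name] = "ok"
        except Exception as exc:
            results[name] = f"failed: {exc}"
    healthy = all(value == "ok" for value in results.values())
    return (200 if healthy else 503), results


def database_down():
    # Hypothetical probe that simulates an unreachable database.
    raise ConnectionError("connection refused")


status, details = deep_health_check(
    {"database": database_down, "geocoder": lambda: None}
)
```

The load balancer only needs the status code; the details are for whoever has to go fix the downstream service.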

High Availability Storage Pattern

Leverage S3 to store static text and binary web assets, letting Amazon deal with the redundancy, encryption and failover

upload static assets to an S3 bucket
create an S3 bucket policy that allows internet users to access and download the assets
use EC2 to run a web server instance that hosts the links to the assets
configure IAM accounts if particular security constraints are desired

Direct Storage Hosting Pattern

For static web sites, remove the web server from the High Availability Storage pattern and let S3 serve up the entire site.

any JavaScript hosted in this manner will have to use CORS in order to talk to services (they won't live in the S3 domain)

Private Data Delivery Pattern

Secure access to S3 assets using a time-sensitive URL

expiring URLs are a feature of S3
time limit is configurable
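
To show the idea behind an expiring URL, here is a generic HMAC-based sketch. It is not S3's actual signing scheme (S3 pre-signed URLs use AWS request signing); the secret, parameter names and path are all hypothetical.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"hypothetical-shared-secret"  # assumption: known only to the server


def _signature(path, expires):
    message = f"{path}:{expires}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def sign_url(path, now, ttl_seconds):
    """Return `path` with an expiry timestamp and signature appended."""
    expires = int(now) + ttl_seconds
    query = urlencode({"expires": expires, "signature": _signature(path, expires)})
    return f"{path}?{query}"


def is_valid(url, now):
    """Accept the URL only if it is untampered and has not expired."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        supplied = params["signature"][0]
    except (KeyError, ValueError):
        return False
    expected = _signature(parsed.path, expires)
    return hmac.compare_digest(supplied, expected) and now <= expires


url = sign_url("/reports/q3.pdf", now=1_000, ttl_seconds=300)
```

Because the expiry is part of the signed message, a client can't extend the time limit by editing the query string.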

Content Delivery Network Pattern

Ensure that data does not have to travel far to reach the end user, reducing latency and improving the end user experience.

CloudFront is Amazon's CDN that does this
Content is optimized either by latency or geographic location
The URL is the "key" used in CF's caching mechanism

Rename Distribution Pattern

Gets new versions of data out to the user in such a way as to ensure that old data is not being used. Also known as "cache busting"

lower the cache timeouts in CloudFront
use expiring S3 URLs
new URLs mean CloudFront will have new "keys" into its cache and won't serve up old content
the book was light on implementation details other than "use a URL shortening service to make things easier"
Spring provides cache busting out of the box
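
Since the book was light on details, here is one common way to get new URLs, sketched under my own assumptions: embed a short content hash in the file name so changed content always gets a new cache key. The paths and the 8-character hash length are arbitrary choices.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(path, content):
    """Embed a short content hash in the file name, e.g. app.js -> app.<hash>.js."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


old = versioned_name("/static/app.js", b"console.log('v1');")
new = versioned_name("/static/app.js", b"console.log('v2');")
```

Unchanged files keep their name and stay cached; changed files get a URL the CDN has never seen.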

Clone Server Pattern

Clones of a static master server are created dynamically, as needed.

create a single master instance that never goes away
put a load balancer in front of the master instance
create an AMI based on the master instance
create a launch configuration for the secondary instances based on the AMI
spin up secondary instances using the launch configuration
use rsync to copy current data from the master server
register the new instances with the ELB

NFS Sharing Pattern

Similar to the Clone Server pattern but we replace rsync with NFS

use of NFS ensures each instance sees the same data on the file system
HDFS or GlusterFS are viable alternatives to NFS
Amazon Elastic File System (EFS) is a managed version of this solution

State Sharing Pattern

Share session information between applications via a fast, in-memory cache.

Redis and Memcached are typical solutions
Amazon's ElastiCache is a managed service that offers those two APIs
data is looked up by key, usually a session identifier
data can be aged out using TTL settings
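
The key lookup and TTL behaviour can be illustrated with an in-process stand-in. This is only a sketch of the semantics; a real deployment would talk to Redis or Memcached, and the `SessionStore` class and fake clock are my own inventions.

```python
import time


class SessionStore:
    """In-process stand-in for a Redis/Memcached-style cache with per-key TTLs."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # session id -> (expiry time, value)

    def put(self, session_id, value, ttl_seconds):
        self._data[session_id] = (self._clock() + ttl_seconds, value)

    def get(self, session_id):
        entry = self._data.get(session_id)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[session_id]  # aged out
            return None
        return value


now = [0.0]  # fake clock so the example is deterministic
store = SessionStore(clock=lambda: now[0])
store.put("session-123", {"user": "alice"}, ttl_seconds=30)
fresh = store.get("session-123")
now[0] = 31.0
stale = store.get("session-123")
```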

URL Rewriting Pattern

Improve upon the Clone Server and NFS Sharing patterns by storing data in S3.

files are uploaded to an application
the application stores the data in S3
when a client asks for the data, the server returns a 301 status code and redirects them to the S3 URL
no longer have to manage file systems
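
A sketch of the redirect step, assuming the application serves files under a `/files/` prefix and the bucket is named `example-uploads` (both hypothetical). It only builds the response; no web framework or S3 call is involved.

```python
from urllib.parse import quote


def redirect_to_s3(request_path, bucket="example-uploads"):
    """Map an application path such as /files/<key> to a redirect at the S3 object."""
    prefix = "/files/"
    if not request_path.startswith(prefix) or request_path == prefix:
        return 404, {}
    key = request_path[len(prefix):]
    location = f"https://{bucket}.s3.amazonaws.com/{quote(key)}"
    return 301, {"Location": location}


response = redirect_to_s3("/files/photos/cat.png")
```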

Cache Proxy Pattern

If CloudFront is taking too long replicating your data out to its edge nodes, cache the data yourself

use Varnish or Squid to cache the data
unclear if the author intended for the application and cache pair to run in different regions around the globe
for availability, need to run a cluster of caches
author suggests running a Consul cluster to help with Varnish cluster registrations
author also suggested having Varnish use S3 for its backing store and possibly avoiding having to run a cache cluster

Write Proxy Pattern

Instead of using HTML form uploads to add data, use alternative protocols to upload the data and have the web front-end read it back from S3.

FTP, SCP, HTTP PUT/POST, UDP
run an EC2 instance to handle the uploads via the protocol of choice
the instance ultimately stores the data in S3
front-end server fetches data either from S3 directly or from a cache backed by S3

Storage Index Pattern

To avoid unnecessary S3 API calls, and the cost associated with them, keep an index of meta-data associated with each uploaded item.

author uses Amazon's RDS to store meta-data but you can use anything you want
the write proxy (upload server) will also write meta-data to the index
the front-end serving the data consults the index for necessary information, including the direct S3 coordinates
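
A small sketch of such an index, using an in-memory SQLite database in place of RDS. The table layout, function names and sample values are assumptions made for illustration.

```python
import sqlite3

index = sqlite3.connect(":memory:")
index.execute(
    "CREATE TABLE uploads ("
    "name TEXT PRIMARY KEY, bucket TEXT, object_key TEXT, size_bytes INTEGER)"
)


def record_upload(name, bucket, object_key, size_bytes):
    """Called by the write proxy after it has stored the object."""
    with index:  # commits on success, rolls back on error
        index.execute(
            "INSERT OR REPLACE INTO uploads VALUES (?, ?, ?, ?)",
            (name, bucket, object_key, size_bytes),
        )


def locate(name):
    """Called by the front-end; answers from the index without touching S3."""
    return index.execute(
        "SELECT bucket, object_key, size_bytes FROM uploads WHERE name = ?",
        (name,),
    ).fetchone()


record_upload("q3-report", "example-uploads", "reports/q3.pdf", 48213)
```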

Direct Object Upload Pattern

Leverage S3's robust HTTP upload support and allow direct uploads to S3.

provide a front-end that hosts an upload form
the front-end does some data validation and constructs an S3 upload URL
the client is redirected to the generated URL and the bits are uploaded directly to S3
after the upload is completed, the client gets redirected to whatever completion page you want

Database Replication Pattern

Create two instances of your database and use its master-slave support to deal with network partitions and server crashes

create a master MySQL instance in one AZ
create a slave MySQL instance in a different AZ
let MySQL replication logic keep the data in sync
using Amazon RDS is probably a better choice
not designed to increase capacity
propagation only occurs from master to slave

Read Replica Pattern

Variation of the Database Replication pattern where the secondary is used for read-only operations

set up the master and slave instances
install MySQL Proxy on the master instance and configure it to route read-only operations to the slave node
HAProxy, MySQL Fabric and Galera are other options
Amazon RDS is probably a better solution
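
The routing decision a proxy makes can be sketched in a few lines. This is a deliberate simplification of what MySQL Proxy or HAProxy would do: it classifies statements by their first keyword only, and the host names are hypothetical.

```python
import itertools

READ_PREFIXES = ("SELECT", "SHOW")


class ReadWriteRouter:
    """Send read-only statements to replicas (round-robin), everything else to the master."""

    def __init__(self, master, replicas):
        self.master = master
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql):
        # Simplification: a statement like SELECT ... FOR UPDATE would need the master.
        is_read = sql.lstrip().upper().startswith(READ_PREFIXES)
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self.master


router = ReadWriteRouter("master", ["replica-1", "replica-2"])
first_read = router.route("select * from orders")
second_read = router.route("SELECT 1")
write = router.route("INSERT INTO orders VALUES (1)")
```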

In-Memory Cache Pattern

Applications avoid hitting the database by caching previous query results

Redis and other key-value stores can be used
Complexity added on the application side
ElastiCache provides the Redis API
DynamoDB is a key-value store
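
The application-side complexity mostly looks like this cache-aside sketch. A plain dict stands in for Redis here, and the `CachedQueries` class and `fake_db` function are hypothetical.

```python
class CachedQueries:
    """Cache-aside: check the cache first, fall back to the database on a miss."""

    def __init__(self, run_query):
        self._run_query = run_query
        self._cache = {}

    def fetch(self, sql):
        if sql not in self._cache:
            self._cache[sql] = self._run_query(sql)
        return self._cache[sql]

    def invalidate(self, sql):
        # Must be called whenever the underlying data changes.
        self._cache.pop(sql, None)


calls = []


def fake_db(sql):
    calls.append(sql)
    return [("alice",)]


queries = CachedQueries(fake_db)
queries.fetch("SELECT name FROM users")
queries.fetch("SELECT name FROM users")  # served from the cache
```

Knowing when to invalidate is the hard part, and it is the application's job.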

Sharding Write Pattern

Improve the Read Replica pattern by sending writes to multiple databases

install MySQL Fabric to handle the details
carefully select a sharding key -- hard to change later
can improve write throughput
may impact read throughput -- data lives in multiple places
Amazon RDS might be a better choice
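
A minimal sketch of hash-based shard selection, assuming three hypothetical shards. It also hints at why the key is hard to change later: the modulo step ties every key to the current shard count.

```python
import hashlib

SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2"]  # hypothetical database names


def shard_for(sharding_key, shards=SHARDS):
    """Pick a shard deterministically from a hash of the sharding key."""
    digest = hashlib.sha256(str(sharding_key).encode()).digest()
    # Changing len(shards) remaps most keys, which is why resharding is painful.
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


home = shard_for("customer-42")
```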

Queuing Chain Pattern

Move work through the system via a series of message queues.

in the old days, we called this SEDA (staged event driven architecture)
use SQS to handle message queuing
incoming message is pulled from queue A, processed and results are published to queue B
pattern is repeated until the final solution is produced
can probably leverage Lambda for the processing pieces
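
The chain can be sketched with in-process queues standing in for SQS. The two stages (strip, then upper-case) are arbitrary placeholders for real work.

```python
from queue import Queue


def run_stage(inbox, outbox, work):
    """Drain `inbox`, apply `work` to each message, publish results to `outbox`."""
    while not inbox.empty():
        outbox.put(work(inbox.get()))


queue_a, queue_b, queue_c = Queue(), Queue(), Queue()
for raw in ["  hello ", " world  "]:
    queue_a.put(raw)

run_stage(queue_a, queue_b, str.strip)  # stage 1: clean up the input
run_stage(queue_b, queue_c, str.upper)  # stage 2: transform it

results = [queue_c.get() for _ in range(queue_c.qsize())]
```

Each stage only knows about its own inbox and outbox, so stages can be scaled or replaced independently.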

Priority Queue Pattern

Variation of the Queuing Chain pattern where certain queues are considered more important than others and get priority attention

message consumers monitor multiple queues
higher priority queues are always serviced first, lower priority queues when there is no higher priority work
requires 3rd party library to provide priorities to SQS queues
maybe Lambda can work here?
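
A sketch of the consumer's polling order, with deques standing in for SQS queues. The queue names and messages are made up; the point is that the consumer always checks the higher-priority queue first.

```python
from collections import deque


def next_message(queues_by_priority):
    """Return (queue name, message) from the highest-priority non-empty queue."""
    for name, queue in queues_by_priority:
        if queue:
            return name, queue.popleft()
    return None


high = deque(["pay invoice"])
low = deque(["send newsletter", "rebuild index"])
queues = [("high", high), ("low", low)]  # listed from highest to lowest priority

served = []
served.append(next_message(queues))  # high-priority work goes first
served.append(next_message(queues))  # high is empty, so low is served
high.append("refund order")          # new urgent work arrives
served.append(next_message(queues))  # and jumps ahead of the remaining low work
served.append(next_message(queues))
```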

Job Observer Pattern

Dynamically scale message consumers by leveraging CloudWatch

similar idea to Scale Out pattern
add a CloudWatch alarm to the priority queue
if the queue depth gets too deep, the auto-scaling group gets triggered and more consumers are created
once the queue depth gets reduced, the auto-scaling group can destroy the extra consumers
can be automated via Terraform
Lambda might obviate this pattern
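
The scaling decision itself is simple arithmetic. Here is a sketch with assumed numbers (100 messages per consumer, between 1 and 10 consumers); in AWS the same decision is expressed through CloudWatch alarms and scaling policies, not code like this.

```python
import math


def desired_consumers(queue_depth, per_consumer=100, minimum=1, maximum=10):
    """Size the consumer pool to the queue depth, within fixed bounds."""
    wanted = math.ceil(queue_depth / per_consumer)
    return max(minimum, min(maximum, wanted))


quiet = desired_consumers(0)
busy = desired_consumers(250)
swamped = desired_consumers(5000)
```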

Bootstrap Pattern

Enhancement to the Stamp pattern where customizations are applied at instance start up

EC2 has the notion of "user data", typically a shell script that is run at boot time
I've used it to start important Docker containers, such as Nomad agents, at boot time
can use S3 to grab the most current logic and data, such as a digital certificate or instance launch script
amount of work done impacts start up time
unclear what happens when the script fails

Cloud Dependency Injection Pattern

Variation of the Bootstrap pattern where the boot script behavior is controlled by the context of the instance being launched

the AMI used is general purpose
the EC2 instance has special meta-data applied to it (tags)
at boot time, the instance consults those tags to determine where in S3 it should obtain its start up script
allows for changes in instance configuration without having to bake a new AMI
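
The tag lookup can be sketched as a pure function. The tag names (`Role`, `Environment`), the bucket and the key layout are all assumptions of mine; on a real instance the tags would be read through the AWS API before this step.

```python
def bootstrap_script_location(tags, bucket="example-bootstrap"):
    """Derive the S3 location of the start up script from the instance's tags."""
    role = tags.get("Role", "default")
    environment = tags.get("Environment", "dev")
    return f"s3://{bucket}/{environment}/{role}/start.sh"


location = bootstrap_script_location({"Role": "web", "Environment": "prod"})
fallback = bootstrap_script_location({})
```

Changing a tag (or the script in S3) changes what the instance does at boot, with no new AMI required.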

Stack Deployment Pattern

Use CloudFormation to generate your application stack from a customized template or existing catalog of templates

templates are JSON (or YAML) files
similar idea to Terraform
the entire stack can easily be destroyed with a single command
unclear how rich the templates can be
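
To give a feel for the template format, here is a very small example built as a Python dict and serialized to JSON. The logical names (`AssetsBucket`, `BucketName`) and the description are hypothetical; real templates usually also carry parameters and many more resources.

```python
import json

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Hypothetical single-bucket stack",
    "Resources": {
        # One resource: an S3 bucket with default settings.
        "AssetsBucket": {"Type": "AWS::S3::Bucket"},
    },
    "Outputs": {
        # Ref on a bucket resolves to the generated bucket name.
        "BucketName": {"Value": {"Ref": "AssetsBucket"}},
    },
}

template_json = json.dumps(template, indent=2)
```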

Monitoring Integration Pattern

Use CloudWatch to monitor your infrastructure and alert you when things become abnormal

when the book was written, Amazon only retained 2 weeks of data, so it wasn't suitable as an operations database (retention has since been extended)
Nagios, Zabbix and Cacti are other tools in this space
author does not offer any concrete examples but does mention security audits, QoS agreements and deep health checks as reasons to put monitoring in place

Web Storage Archive Pattern

Rotate logs out of the instance and into S3

author says that there are hundreds of tools that help in this space
Loggly is one of those tools
S3 provides for aging of old logs to cold storage (Glacier)

Weighted Transition Pattern

When rolling out new bits, direct a small portion of the traffic to the new bits to prove them out

old system and new system run at the same time
Route53 is configured to route a small percentage of traffic to the new system
monitor the new system for abnormalities
slowly move all the traffic to the new system
the hard part is merging the old database into the new database
might be automated by Terraform
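
Weighted routing sends each target a share of traffic proportional to its weight. The simulation below illustrates a 95/5 split; the target names, weights and seed are arbitrary, and real traffic is split by DNS responses, not by code in the application.

```python
import random


def pick_target(weights, rng):
    """Choose a target with probability proportional to its weight."""
    targets = list(weights)
    return rng.choices(targets, weights=[weights[t] for t in targets], k=1)[0]


rng = random.Random(42)  # seeded so the run is repeatable
weights = {"old-system": 95, "new-system": 5}
sample = [pick_target(weights, rng) for _ in range(10_000)]
share_new = sample.count("new-system") / len(sample)
```

Moving all the traffic over is then just a matter of raising one weight and lowering the other in steps.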

Hybrid Backup Pattern

Not covered in depth but the idea is to back up on-premises application data to the cloud

use S3 for fast storage of data
use Glacier for cold storage
S3 also has an "infrequent access" storage class
data can also be encrypted at rest

On-Demand NAT Pattern

For security reasons, keep your instances off the internet and in a private subnet. Use a NAT router to temporarily enable internet access as needed.

place EC2 instance in a private subnet
bring up another instance to use as a NAT gateway
internal instances route public traffic through the gateway
turn off the NAT instance when done
can't get to S3, ELB and any other AWS services that require traversal via the internet
controlling the Internet Gateway might be an alternative

Management Network Pattern

Use dual NICs to control data meant for the internet and data meant for the corporate network

sometimes called a backnet or management network
easier to understand the flow of traffic and apply security groups
useful in migration because cloud-based assets can contact on-premises assets
requires a VPN on the management network

Functional Firewall Pattern

Stack individual security groups to achieve firewall rules that are easier to manage

insecure http (80) from internet to internal network
secure http (443) from internet to internal network
MySQL (3306) from instances using either the insecure or secure http group
Redis (6379) from instance using either insecure or secure http group
reuse individual groups and stack them, keeping things DRY
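
The stacking idea can be modelled with plain data. This is a toy model, not the security group API: the group names are made up, and "internet" stands in for an open CIDR range.

```python
# Each group lists the (port, source) pairs it admits. A source is either
# "internet" or the name of another group.
GROUPS = {
    "web-http": [{"port": 80, "source": "internet"}],
    "web-https": [{"port": 443, "source": "internet"}],
    "mysql": [
        {"port": 3306, "source": "web-http"},
        {"port": 3306, "source": "web-https"},
    ],
    "redis": [
        {"port": 6379, "source": "web-http"},
        {"port": 6379, "source": "web-https"},
    ],
}


def allows(instance_groups, port, sources):
    """True if any group stacked on the instance admits one of `sources` on `port`."""
    return any(
        rule["port"] == port and rule["source"] in sources
        for name in instance_groups
        for rule in GROUPS[name]
    )


web = ["web-http", "web-https"]  # groups stacked on a web server
data = ["mysql", "redis"]        # groups stacked on a data server
```

Each rule is defined once and reused by stacking, which is what keeps things DRY.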

Operational Firewall Pattern

Stricter version of the Functional Firewall that restricts incoming traffic to a specific source

restrict to a network
restrict to a specific IP
restrict to a particular client certificate
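
The first two restrictions boil down to a CIDR membership test, sketched here with the standard `ipaddress` module. The network and address are example values only; client certificate checks are out of scope for this sketch.

```python
import ipaddress

ALLOWED_SOURCES = [
    ipaddress.ip_network("10.20.0.0/16"),     # a whole network
    ipaddress.ip_network("198.51.100.7/32"),  # a single IP
]


def source_allowed(address, allowed=ALLOWED_SOURCES):
    """True if `address` falls inside any of the allowed networks."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in allowed)


from_network = source_allowed("10.20.3.4")
from_single_ip = source_allowed("198.51.100.7")
from_elsewhere = source_allowed("198.51.100.8")
```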

Web Application Firewall Pattern

Not cloud-specific but a general approach.

use a web application firewall, e.g. ModSecurity, to understand the context of the data being transferred
stateful packet inspection rules are not good enough
firewalls live outside of the instances they are trying to protect
author light on specifics

AWS WAF is a web application firewall that helps protect your web applications from common web exploits that could affect application availability, compromise security, or consume excessive resources. AWS WAF gives you control over which traffic to allow or block to your web applications by defining customizable web security rules. You can use AWS WAF to create custom rules that block common attack patterns, such as SQL injection or cross-site scripting, and rules that are designed for your specific application. New rules can be deployed within minutes, letting you respond quickly to changing traffic patterns.

Multi-Load Balancer Pattern

Variation of the Operational Firewall

use multiple ELBs
each ELB does TLS termination (new certificate manager makes it even easier to do this now)
each ELB has client-specific security groups applied to control inbound traffic
instances behind the ELBs use the simpler insecure HTTP
standard security group stacking is applied to ensure instances are properly restricted

Infrastructure As Code Pattern

Apply the same techniques to operational concerns that you do to production code

automate and store scripts in source control
CloudFormation to construct an entire stack
Packer to build AMIs, Docker images and Droplets
Fugue is a new YAML-based tool aimed at AWS

Temporary Development Environment Pattern

Use a virtualized environment coupled with automated AWS scripting to develop in a near-production environment

if you deploy in Linux/AWS, develop in Linux/AWS to avoid "gotchas"
Vagrant can spin up VirtualBox
Terraform can spin up a private, temporary AWS area
keep control logic under source control
CI environments should use something similar

Continuous Integration Pattern

Leverage AWS to run your CI builds

spin up build agents as needed
alternative is to auto-scale them during working hours
can keep the master node on premises
gives you the ability to build and test under different environments and configurations
unsure if we have enough Bamboo licenses to make this viable

Cloud Design Patterns for AWS

By Ronald Kurr

A summary of the patterns described in the book "Implementing Cloud Design Patterns for AWS"
