Building a robust and resilient cloud infrastructure is a fundamental challenge for cloud solution architects. In today’s post, I will show how to build a fault-tolerant, customer-oriented infrastructure that serves customers from a nearby, healthy location. Using Amazon Route 53 and the AWS Cloud Development Kit (CDK) with Python, I will automatically route each client to the nearest (lowest-latency) healthy endpoint where the service resides.
What are the challenges of building a latency-based routing solution?
To build a routing architecture that serves each user from the location closest to them, we need to think about several aspects. First of all, we need to determine which location of our service is the closest to the user. Secondly, we need to check whether this nearest location is fully functional. The managed DNS service offered by AWS lets us implement both requirements: Amazon Route 53 can pick the closest service location and take appropriate action when a given location is not fully functional. Knowing this, let’s go a step further and design and implement such an infrastructure.
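The two checks described above can be sketched as a toy model in plain Python. This is only an illustration of the decision, not how Route 53 works internally: the `RegionEndpoint` class, the `pick_endpoint` function, and the latency numbers are all made up for this example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class RegionEndpoint:
    region: str
    latency_ms: float  # hypothetical latency as seen from the client
    healthy: bool      # hypothetical result of a health evaluation


def pick_endpoint(endpoints: Sequence[RegionEndpoint]) -> Optional[RegionEndpoint]:
    """Return the healthy endpoint with the lowest latency, or None if there is none."""
    healthy = [e for e in endpoints if e.healthy]
    return min(healthy, key=lambda e: e.latency_ms, default=None)


# Made-up numbers for a client located in Europe.
endpoints = [
    RegionEndpoint("eu-central-1", latency_ms=30.0, healthy=True),
    RegionEndpoint("us-east-1", latency_ms=110.0, healthy=True),
]
print(pick_endpoint(endpoints).region)
```

With both regions healthy, the sketch picks eu-central-1; marking that entry as unhealthy makes it fall back to us-east-1, which is the behaviour we want from the DNS layer.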
Example latency-based routing solution built with AWS CDK
The diagram above shows the architecture of the example latency-based routing solution we’ll build. Under the hood, multiple web servers run on EC2 machines and serve some content. These EC2 machines are placed in Auto Scaling groups (ASGs), which are attached to Application Load Balancers (ALBs) in each region. We want to build a “top-layer” routing solution which will:
- resolve the same record name, blog.cloud.cloudybarz.com, for all regions in which the website is available,
- check which region (ALB target) is the closest to the user requesting a connection to the website (latency-based routing),
- in case of a malfunction of a specific region, route traffic to the next closest healthy region (ALB target).
Let’s now divide that into smaller parts to understand how it works.
Configuring Amazon Route 53 Hosted Zones
To start working with Amazon Route 53, we need to understand some basic concepts of this service. The first thing we need to create in Amazon Route 53 is a Hosted Zone, which is a container for our DNS records. In our case we’ll have a hosted zone called cloud.cloudybarz.com. Here’s how it looks in the AWS console:
To use such a Hosted Zone, you can create a dedicated subdomain and delegate it to Amazon Route 53. In my case, my main domain’s DNS provider is Cloudflare, so after creating the new Hosted Zone, I need to register its name servers in my provider’s settings.
Let’s see how such a Hosted Zone is described in AWS CDK code:
Here we see a simple construct that defines a hosted zone with a specific name and returns its parameters in the output data. It is important to remember that Route 53 is a global service, so to avoid duplicate hosted zones when deploying to several regions, we must make sure that the hosted zone is created only once. I ensure this with the custom is_master_region() method (sources will be attached later).
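To illustrate the “create it only once” idea, here is a minimal sketch in plain Python that builds a CloudFormation-style fragment for an AWS::Route53::HostedZone resource only in the master region. It is not the project’s actual CDK code: the `MASTER_REGION` constant, the body of `is_master_region()`, and the `hosted_zone_template()` helper are assumptions made for this example.

```python
import json

MASTER_REGION = "eu-central-1"  # assumption: the region chosen as "master"


def is_master_region(region: str) -> bool:
    """Hypothetical stand-in for the custom helper mentioned in the post."""
    return region == MASTER_REGION


def hosted_zone_template(region: str, zone_name: str) -> dict:
    """Return a template fragment with the hosted zone only for the master region."""
    if not is_master_region(region):
        # Other regions must not create a second zone with the same name.
        return {"Resources": {}, "Outputs": {}}
    return {
        "Resources": {
            "HostedZone": {
                "Type": "AWS::Route53::HostedZone",
                "Properties": {"Name": zone_name},
            }
        },
        "Outputs": {
            # Name servers to register at the parent domain's DNS provider.
            "NameServers": {
                "Value": {
                    "Fn::Join": [",", {"Fn::GetAtt": ["HostedZone", "NameServers"]}]
                }
            }
        },
    }


print(json.dumps(hosted_zone_template("eu-central-1", "cloud.cloudybarz.com"), indent=2))
```

Calling `hosted_zone_template("us-east-1", "cloud.cloudybarz.com")` returns an empty fragment, so deploying a second region does not duplicate the zone.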
Configuring Amazon Route 53 DNS record for latency-based routing
Now that our Hosted Zone is ready to hold DNS records, we can start adding them. Let’s assume that we have already created the infrastructure in two regions: eu-central-1 and us-east-1. Let’s start by deploying a record for the Load Balancer of our web server in the first region (in our case eu-central-1).
What’s important in this process is to:
- select the A record type and enable Alias,
- set the target to the Load Balancer in the specific region (e.g. eu-central-1),
- set the Routing policy to Latency,
- enable Evaluate target health (this ensures that Route 53 is aware of the target’s health),
- set the Record ID to a value that distinguishes records with the same name in different regions.
Similarly, we will create a record for our second region.
Let’s jump into the code to see how such a record is created with AWS CDK and Python:
When creating a DNS record inside a Hosted Zone, we must ensure that the CfnRecordSet object is properly defined. We need to give it an appropriate name, type, and the region to which it refers, and, above all, set the evaluate_target_health parameter to True to make sure that Route 53 is aware of the health of the target to which a given record points.
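As a rough sketch of what such a record contains, the snippet below builds the CloudFormation properties of an AWS::Route53::RecordSet resource, which is the resource type that CfnRecordSet represents. It is plain Python rather than the project’s CDK code, and the `latency_alias_record()` helper, the zone IDs, and the ALB DNS names are placeholders invented for this example.

```python
def latency_alias_record(record_name: str, zone_id: str, region: str,
                         alb_dns_name: str, alb_zone_id: str) -> dict:
    """Build a latency-routed alias A record pointing at one region's ALB."""
    return {
        "Type": "AWS::Route53::RecordSet",
        "Properties": {
            "Name": record_name,           # same name in every region
            "Type": "A",
            "HostedZoneId": zone_id,       # the Route 53 hosted zone holding the record
            "Region": region,              # enables latency-based routing
            "SetIdentifier": f"{record_name}-{region}",  # must differ per region
            "AliasTarget": {
                "DNSName": alb_dns_name,
                "HostedZoneId": alb_zone_id,   # the load balancer's own zone ID
                "EvaluateTargetHealth": True,  # let Route 53 follow the ALB's health
            },
        },
    }


# Placeholder values, one record per region.
records = [
    latency_alias_record("blog.cloud.cloudybarz.com", "ZONE_ID_PLACEHOLDER", region,
                         f"example-alb.{region}.elb.amazonaws.com", "ALB_ZONE_ID_PLACEHOLDER")
    for region in ("eu-central-1", "us-east-1")
]
for record in records:
    print(record["Properties"]["SetIdentifier"])
```

Both records share the same Name and differ only in Region, SetIdentifier, and alias target, which is what lets Route 53 choose between them for each query.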
If you want to see the full example project for this latency-based routing solution, please check my GitHub repository by following this link.
Demo
Before we start
Let’s now check how this “beast” works in a real environment. As mentioned above, I’ve prepared a full infrastructure for the described routing solution with AWS CDK to test how it works, so now let’s deploy it. You can read a complete “How to start” in the repository’s readme, but here you’ll find a small sneak peek of how to deploy it.
Create Hosted Zone
Please make sure that you’ve properly set your AWS SSO’s main region in constants.py. In my case it’s eu-central-1. The code is designed so that the first deployment creates only a Hosted Zone in the master region, because it will be needed by later components such as the ALB. Let’s run the ./deploy.sh --region eu-central-1 command.
Now it’s time to grab the DNS names provided in the CfnOutput and set them in your DNS provider’s panel (set a low TTL or wait for the DNS changes to propagate).
Create Web Server target (ALB + EC2) in the first region
Next, let’s deploy the ALB + EC2 (web server) in the first region (in my case eu-central-1). Simply re-run the same command: ./deploy.sh --region eu-central-1.
Create Web Server target (ALB + EC2) in the second region
Having all this set, let’s deploy our toys to the second region (in our case us-east-1). Let’s run the ./deploy.sh --region us-east-1 command.
Simulating normal traffic
Let’s check if everything works as expected. Let’s take the domain we’ve registered in Route 53 (in our case blog.cloud.cloudybarz.com). I should be routed to the server closest to my location (Poland), which should be eu-central-1 (Frankfurt).
We see that it worked well. I was routed to the closest operational location of the service.
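Besides opening the page in a browser, the answer returned by DNS can also be inspected from Python. The helper below is a small sketch using only the standard library; the `resolve_ipv4()` name and the injectable `resolver` argument are my own choices for this example, and the returned addresses still have to be compared manually with those of the regional ALBs.

```python
import socket
from typing import Callable, List


def resolve_ipv4(hostname: str, resolver: Callable = socket.getaddrinfo) -> List[str]:
    """Return the sorted, de-duplicated IPv4 addresses the hostname resolves to."""
    infos = resolver(hostname, 80, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


# Example call (needs network access and a deployed record):
# print(resolve_ipv4("blog.cloud.cloudybarz.com"))
```

Keep in mind that the result reflects what your local resolver returns, so cached answers may linger until the record’s TTL expires.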
Simulating the failure of the target (unhealthy state)
Now let’s check how routing behaves when one region becomes non-operational (unhealthy). For that, I broke the target group by making it unhealthy. This should “inform” Route 53 that no targets are available in that region (in our case, the broken region is the one closest to my location).
So there is nothing more to do but check whether we’ll be routed to the other healthy region (us-east-1). Let’s clear the cache and refresh the page, blog.cloud.cloudybarz.com.
And boom! We were properly routed to the nearest, healthy region 🙂
Summary
Thank you for reading this far. If you want to know more about AWS and the cloud, check the posts below:
- Decrease cost and boost performance by moving to Graviton
- Autoscaling solution for Amazon ECS Cluster
- Converting infrastructure to AWS CloudFormation and AWS CDK
- Is AWS certification really worth it?