AWS Case Study 2 - Pandemic Stats App - 100% serverless
Here is my recipe for "baking" a 100% pure serverless AWS web app.
You'll need:
Ready to start cooking?
Let's discuss "The Requirements"
Before we take the deep dive and start designing our Pandemic Stats Web App, let's begin by defining the requirements that our app and its architecture should fulfill.
And since we are in the year 2021, let's make these requirements as demanding as possible:
Our Pandemic Stats Web App should:
Most of these requirements are pretty tough, aren't they?
Even though this may look like a mission-impossible task (mainly because of the "<1 EUR operating costs" rule), it is indeed achievable, and I'm going to show you how.
Let's design "The Architecture"
Based on the requirements noted above, we don't really have many architectural choices available.
Frankly, we have just one choice - to choose the serverless architecture.
And since we are all fans of AWS (I hope so :), we will host our solution on the AWS cloud platform.
The AWS architecture I designed may look quite complex at first sight.
However, if we want to fulfill all of the requirements, get the most from AWS, and pay the least, this doesn't leave too many alternatives.
Let's take a look "Under The Hood"
The solution I designed is powered by 2 Lambda functions and 1 CloudFront distribution.
The 1st Lambda function - FetchProcessAndStoreCOVID19Data
It does these 2 things:
a) It executes an SQL query via Amazon Athena and fetches the total number of confirmed COVID-19 cases in the EU member states. Athena retrieves this data from the COVID-19 data lake, which is located in a public Amazon S3 bucket.
Do you remember my previous article where I wrote about the AWS Public Datasets? The COVID-19 Data Lake is one of these free datasets. It is a centralized repository of up-to-date data related to the spread and characteristics of the novel coronavirus (SARS-CoV-2) and its associated illness (COVID-19).
b) After the Lambda function gets the data from Athena, it parses the SQL query result set and pregenerates the HTML content of the webpage that users see when they visit the PandemicStats.Cloud web app. This HTML code is then stored in an Amazon DynamoDB table.
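To make step b) more concrete, here is a minimal Python sketch of the parsing and pregeneration part. It is not the actual code of the function. It only assumes the response shape of the Athena GetQueryResults API, where every value arrives as a "VarCharValue" string and the first row of the first page contains the column headers. The two-column query result (country, confirmed cases), the DynamoDB attribute names "page_id" and "html", and the function names are hypothetical.

```python
from html import escape


def rows_from_athena(result):
    """Turn an Athena GetQueryResults response into (country, cases) pairs.

    The first row holds the column headers, so it is skipped.
    Assumes a hypothetical two-column result: country name, confirmed cases.
    """
    pairs = []
    for row in result["ResultSet"]["Rows"][1:]:
        country, cases = (cell.get("VarCharValue", "") for cell in row["Data"])
        pairs.append((country, int(cases)))
    return pairs


def render_html(pairs):
    """Pregenerate the HTML table that the webpage will display."""
    body = "".join(
        f"<tr><td>{escape(country)}</td><td>{cases:,}</td></tr>"
        for country, cases in pairs
    )
    return f"<table><tr><th>Country</th><th>Confirmed cases</th></tr>{body}</table>"


def store_page(table, html):
    """Save the page under a fixed key.

    `table` is expected to behave like a boto3 DynamoDB Table resource;
    the attribute names "page_id" and "html" are assumptions.
    """
    table.put_item(Item={"page_id": "index", "html": html})
```

Keeping the parsing and rendering apart from the AWS calls means the logic can be tried locally with a hand-written result set before it is wired to Athena and DynamoDB.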
As the total number of COVID-19 cases changes every day, we need to make sure our webpage always displays the current figures. That's why we need to execute this Lambda function at least daily. For this purpose, we use Amazon CloudWatch, which allows us to set up a scheduled rule that triggers the Lambda function automatically every few hours.
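For illustration, a scheduled rule like this could be created with boto3 roughly as follows. This is a sketch under assumptions: the 6-hour interval, the rule name, and the helper name schedule_lambda are hypothetical, and the client is passed in so the function can be tried with a stand-in object. The put_rule and put_targets calls and the rate(...) schedule expression syntax are part of the CloudWatch Events (EventBridge) API.

```python
def schedule_lambda(events_client, rule_name, lambda_arn, every_hours=6):
    """Create or update a scheduled rule and point it at the Lambda function.

    `events_client` is expected to behave like boto3.client("events").
    The default of 6 hours is an assumption, not the article's actual setting.
    """
    # Rate expressions use the singular unit for a value of 1.
    unit = "hour" if every_hours == 1 else "hours"
    events_client.put_rule(
        Name=rule_name,
        ScheduleExpression=f"rate({every_hours} {unit})",
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "1", "Arn": lambda_arn}],
    )


# Hypothetical usage (requires boto3 and AWS credentials):
# import boto3
# schedule_lambda(boto3.client("events"), "refresh-pandemic-stats", "<ARN of the Lambda function>")
```

Note that the Lambda function also needs a resource-based permission that allows the rule to invoke it. That step is left out of this sketch.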
The 2nd Lambda function - GetWebpageData
Together with Amazon API Gateway, it acts like a web server that serves the webpage content stored in the Amazon DynamoDB table to the internet browser.
This database table holds the HTML content containing the COVID-19 data that our 1st Lambda function pregenerated, as well as the fancy graphics that we display on our webpage.
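A function of this kind can be very small. The sketch below assumes that API Gateway uses the Lambda proxy integration (so the function returns a dict with statusCode, headers, and body) and that the page is stored under the hypothetical key "page_id" = "index" in an "html" attribute. Neither assumption is confirmed by the actual setup of GetWebpageData.

```python
def build_response(item):
    """Shape a DynamoDB item into an API Gateway Lambda proxy response."""
    if item is None:
        return {
            "statusCode": 404,
            "headers": {"Content-Type": "text/plain"},
            "body": "Not found",
        }
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "text/html; charset=utf-8"},
        "body": item["html"],
    }


def make_handler(table):
    """Build a Lambda handler bound to a table object.

    `table` is expected to behave like a boto3 DynamoDB Table resource.
    """
    def handler(event, context):
        result = table.get_item(Key={"page_id": "index"})
        # get_item omits the "Item" key when nothing is stored under the key.
        return build_response(result.get("Item"))
    return handler
```

Passing the table in from outside keeps the handler free of global state, so it can be exercised without any AWS access.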
The role of Amazon CloudFront
There are 3 reasons why we need to use Amazon CloudFront.
1. The first reason is that all Amazon API Gateway REST endpoints (through which we let internet users access our webpage) are designed by Amazon to accept only HTTPS requests.
If we didn't use CloudFront, visitors of our Pandemic Stats web app would always have to type its full https:// URL into their internet browsers.
And you know how it works these days: most internet users (myself included) skip the protocol part of the URL and type just the hostname.
Without CloudFront, no page would be displayed in such a case, and users would think our webpage doesn't work at all.
As we don't want this to happen, we need to make sure both HTTP and HTTPS URLs work. With CloudFront, it is fairly easy to configure a rule that redirects the HTTP URL to the HTTPS one.
2. The second reason why we need Amazon CloudFront is to ensure that, no matter where in the world our visitors are, they always have lightning-fast access to our webpage.
To ensure this, Amazon CloudFront uses a global network of 225+ Points of Presence (215+ Edge locations and 12 regional mid-tier caches) in 88 cities across 45 countries.
In other words, Amazon CloudFront stores a cached version of our webpage in these network nodes around the world and, depending on the user's location, delivers the webpage from the nearest CloudFront node.
3. The third reason is that it saves us money. Thanks to its caching capabilities, only 1-2 API Gateway REST API requests (and therefore executions of the 2nd Lambda function) are made per day, no matter how many visits our webpage gets.
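Reasons 1 and 3 both come down to a few settings on the distribution's cache behavior. The fragment below is an illustrative sketch and not the actual PandemicStats.Cloud configuration: the field names follow the CloudFront API, while the TTL values are assumptions. The small helper function is hypothetical and only shows how the cache lifetime relates to the "1-2 requests per day" figure for a single cache node.

```python
# Fragment of a CloudFront cache behavior (field names follow the CloudFront API).
default_cache_behavior = {
    "ViewerProtocolPolicy": "redirect-to-https",  # HTTP requests are redirected to HTTPS
    "MinTTL": 0,
    "DefaultTTL": 43200,  # assumed: keep the cached page for 12 hours
    "MaxTTL": 86400,      # assumed: never cache longer than 24 hours
}


def origin_fetches_per_day(ttl_seconds):
    """Approximate number of times one cache node re-fetches the page per day."""
    seconds_per_day = 86400
    return -(-seconds_per_day // ttl_seconds)  # ceiling division


fetches = origin_fetches_per_day(default_cache_behavior["DefaultTTL"])
```

With a cache lifetime of 12 to 24 hours, a cache node asks API Gateway for a fresh copy only once or twice a day, regardless of how many visitors it serves in between.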
Let's discuss "The Costs"
How is it possible that the operating costs of the PandemicStats.Cloud webpage are just around 60 cents?
It is possible because most of the AWS web services that we use offer a Free Tier programme, in which you are not billed unless your usage goes over defined limits.
The one AWS web service that does charge us some money each month is Amazon Route 53. Our monthly operating costs of +/- 51 cents come exactly from this: we use it as our Domain Name System (DNS) service to host the PandemicStats.Cloud domain zone.
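As a rough plausibility check, the Route 53 part of the bill can be estimated with a few lines of Python. The prices below are assumptions based on the public Route 53 list prices in USD (a fixed monthly fee per hosted zone plus a fee per million standard queries), and the number of DNS queries is a hypothetical figure chosen for illustration. None of these values are taken from the actual invoice.

```python
def route53_monthly_cost(
    hosted_zones=1,
    dns_queries=25_000,              # hypothetical monthly query volume
    zone_price=0.50,                 # assumed list price per hosted zone per month
    price_per_million_queries=0.40,  # assumed list price for standard queries
):
    """Estimate the monthly Route 53 charge for hosting a DNS zone."""
    zone_cost = hosted_zones * zone_price
    query_cost = dns_queries / 1_000_000 * price_per_million_queries
    return zone_cost + query_cost


estimate = round(route53_monthly_cost(), 2)
```

Under these assumptions, nearly the whole amount is the fixed hosted zone fee, and the DNS queries of a small site add only about a cent on top.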
Frequently asked questions
1. What about the price of the domain name PandemicStats.Cloud itself? Surely its registration and yearly renewal are not free.
Yes, you are right. However, I intentionally didn't include the price of the domain in the operating costs of 60 cents because, no matter where you run your website, you always have to pay for the domain name.
Amazon charges around 25 EUR per year for a .cloud domain. If you pick another domain, you may get even lower pricing.
2. When I look at the architecture diagram, I can see only 10 AWS web services. Where are the other 2 that you didn't mention, and are they also needed?
Besides those 10 services, we need 2 more. We need them only initially, to help us configure CloudFront and Lambda.
3. There is a graphical COVID-19 chart displayed on the PandemicStats.Cloud website. How is this technically possible, and who or what draws it? Is it another AWS web service you haven't mentioned yet?
No, it has nothing to do with AWS this time.
The chart with the total number of COVID-19 cases is rendered on the client side using JavaScript and the Google Chart API, which is a free product from Google.
For more information, visit: https://developers.google.com/chart
4. Showing just one COVID-19 metric on the webpage is not really a big deal for website visitors. What's the real value of the PandemicStats.Cloud website?
Please bear in mind that the PandemicStats.Cloud website was built to demonstrate the power of serverless architecture, the possibilities of integrating 12 AWS web services with each other, and the cost efficiency of such an architecture. It is mostly a proof of concept.
Nevertheless, anyone who is interested can extend the Lambda functions and offer more features to the website's users. The possibilities are endless, as the COVID-19 data lake is a vast data source.
5. In your previous AWS articles, you published an entire step-by-step tutorial covering all the necessary settings and AWS integrations. Why are they not included in this article?
I originally wanted to, but as 12 AWS web services are involved, this article would be ~1 km long and difficult for anyone to read through.
However, those of you who want to try these things yourselves are welcome to get in touch with me via my LinkedIn profile, and I will gladly help you set things up.
If I get similar questions from more of you, I will take it as a signal and write one or more follow-up articles that explain things in more detail.