
Deploy a Python Dashboard on AWS

Sreejith Munthikodu

Senior Data Scientist and Data Architect at BC Public Service

Published: June 18, 2020

Recently, I created an interactive COVID-19 dashboard in Python using Plotly Dash. I would like to share the steps I followed to get the app running on an AWS EC2 instance. I also scheduled the EC2 instance to fetch up-to-date data from the data source.

Deployed app

Project GitHub Repo

Data Source: CSSE @ Johns Hopkins University

The app

Plotly Dash is an open-source framework for building enterprise-ready analytic web apps without writing JavaScript. It lets data analysts and data scientists publish their dashboards and data analytics products without having to handle the complex tasks involved in developing dynamic web apps. Dash supports both Python and R. If you want to quickly learn how to use Dash, please refer to this tutorial in the official documentation.

I used Dash to build the COVID-19 dashboard featured in this article. Data for the app is obtained from the popular COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University. The data was cleaned in Python using Pandas and NumPy. Plots were created using Plotly Express in Python. This is what the final dashboard looks like.
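
To give a flavour of the cleaning step, here is a minimal sketch of reshaping time-series data from one column per date into one record per location and date. It is not the code used in the app: the article's cleaning was done with Pandas, while this sketch uses only the standard library so it runs anywhere. The sample rows are made up, and the column layout (four identifying columns followed by date columns) is an assumption about the source files.

```python
import csv
import io

# Made-up sample in the assumed wide layout of the CSSE time-series files:
# four identifying columns, then one column per date.
SAMPLE_CSV = """Province/State,Country/Region,Lat,Long,1/22/20,1/23/20
,Canada,56.1,-106.3,0,2
British Columbia,Canada,53.7,-127.6,1,3
"""

ID_COLUMNS = ("Province/State", "Country/Region", "Lat", "Long")


def wide_to_long(csv_text):
    """Turn one-column-per-date rows into one record per (location, date)."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column, value in row.items():
            if column in ID_COLUMNS:
                continue  # identifying columns are not observations
            records.append({
                "province": row["Province/State"],
                "country": row["Country/Region"],
                "date": column,
                "cases": int(value),
            })
    return records


long_rows = wide_to_long(SAMPLE_CSV)
```

Long-form records like these are the shape that plotting libraries generally expect when drawing one line per country over time.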

Deployment

Create an EC2 Instance

The app was created for learning purposes, so I wanted to deploy it on a free service. Since I am within the AWS free tier period, I decided to go with AWS, and I used only free-tier-eligible resources in this project. I assume you already have an AWS account set up.

Create an EC2 t2.micro instance as the server for this web app. From the AWS Management Console, under Services, click EC2.
Click Instances and then Launch Instance.
Select Ubuntu Server 18.04 LTS as the Amazon Machine Image (AMI).
Select the t2.micro instance type, which is free tier eligible. Click Review and Launch, then Launch.
To connect to the instance securely, create a new key pair and download the private key file. Keep this file safe, since anyone who has it can connect to the EC2 instance.
Launch Instance
If you go back to the EC2 service, under Instances, you will find the new instance running. This will be the server for our app.

Configure Inbound Rules

It is safer to restrict SSH access to the EC2 instance to your own IP address. Click on the running instance. Under Description -> Security Groups, click launch-wizard-1.
Click Inbound Rules -> Edit Inbound Rules.
For port 22, select My IP as the source. This ensures that only your IP address can connect to the EC2 instance over SSH.
Open port 80 so that users can reach the app over the web: click Add Rule, enter 80 under Port Range, and select Anywhere under Source. Do the same for port 8050, or whichever port you plan to run the app on.
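
The same three rules can also be described in code. The sketch below is not part of the original project: it builds the rule list in the shape that boto3's `authorize_security_group_ingress` call accepts, and only sends it when `apply_rules` is called with a real security group ID. The function names and the example IP address (taken from a documentation-only range) are assumptions for illustration.

```python
def build_ingress_rules(my_ip, app_port=8050):
    """Return IpPermissions entries mirroring the console rules above."""
    def tcp_rule(port, cidr):
        return {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [{"CidrIp": cidr}],
        }

    return [
        tcp_rule(22, f"{my_ip}/32"),      # SSH: only your own address
        tcp_rule(80, "0.0.0.0/0"),        # HTTP: open to everyone
        tcp_rule(app_port, "0.0.0.0/0"),  # the port the Dash app listens on
    ]


def apply_rules(group_id, rules):
    """Send the rules to EC2; needs boto3 and configured AWS credentials."""
    import boto3  # imported lazily so the rule builder works without it

    ec2 = boto3.client("ec2")
    ec2.authorize_security_group_ingress(GroupId=group_id, IpPermissions=rules)


rules = build_ingress_rules("203.0.113.10")  # documentation-range example IP
```

Keeping the rule builder separate from the API call makes it easy to review exactly which ports are being opened before anything changes in the account.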

Copy the Project to AWS S3 Bucket

Install the AWS CLI, which is used to interact with AWS services from the command line. Follow the instructions here.
Follow the instructions here to get an AWS access key.
Configure the AWS CLI by typing `aws configure` from your command line.
Provide the Access Key ID and Secret Access Key you obtained in step 2, along with a default region name. You may leave the default output format.
Now create an S3 bucket to store the project by typing `aws s3 mb s3://bucket-name`.
Copy the files from your project to the S3 bucket by typing `aws s3 cp <your directory path> s3://<your bucket name> --recursive`.

aws configure

aws s3 mb s3://bucket-name

aws s3 cp <your directory path> s3://<your bucket name> --recursive
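
For readers who prefer scripting the upload, here is a rough Python equivalent of the recursive copy. It is a sketch rather than the method used in this project: `plan_uploads` and `upload_project` are hypothetical helper names, and the actual upload relies on boto3's `upload_file` and on credentials already set up with `aws configure`. The demonstration at the bottom only computes the object keys for a throwaway directory and does not touch S3.

```python
import os
import tempfile


def plan_uploads(project_dir, prefix=""):
    """List (local_path, s3_key) pairs for every file under project_dir."""
    pairs = []
    for root, _dirs, files in os.walk(project_dir):
        for name in files:
            local_path = os.path.join(root, name)
            relative = os.path.relpath(local_path, project_dir)
            # S3 keys always use forward slashes, whatever the local OS uses.
            key = prefix + relative.replace(os.sep, "/")
            pairs.append((local_path, key))
    return sorted(pairs, key=lambda pair: pair[1])


def upload_project(project_dir, bucket):
    """Upload the planned files; needs boto3 and configured credentials."""
    import boto3  # lazy import: planning works without boto3 installed

    s3 = boto3.client("s3")
    for local_path, key in plan_uploads(project_dir):
        s3.upload_file(local_path, bucket, key)


# Demonstrate the key mapping on a throwaway directory.
with tempfile.TemporaryDirectory() as demo_dir:
    os.makedirs(os.path.join(demo_dir, "data"))
    for rel in ("app.py", os.path.join("data", "cases.csv")):
        with open(os.path.join(demo_dir, rel), "w") as handle:
            handle.write("placeholder\n")
    planned_keys = [key for _path, key in plan_uploads(demo_dir)]
```

Listing the planned keys first is a cheap way to check that files such as local virtual environments or credentials are not about to be uploaded along with the project.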

Connect to the EC2 Instance

Right-click the running EC2 instance in the AWS Management Console and click Connect.
Follow the instructions shown to connect to the EC2 instance remotely from your command line. On Ubuntu, running the example command from that dialog connects you to the instance.
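
SSH generally refuses a private key file that other users can read, so the connection instructions usually include tightening the key's permissions before connecting. The sketch below shows both steps in Python, assuming the default `ubuntu` login user of Ubuntu AMIs. The key file name and host address are placeholders, and the demonstration runs against a throwaway file rather than a real key.

```python
import os
import stat
import tempfile


def restrict_key_permissions(key_path):
    """Make the private key readable by its owner only (like chmod 400)."""
    os.chmod(key_path, stat.S_IRUSR)
    return stat.S_IMODE(os.stat(key_path).st_mode)


def ssh_command(key_path, host, user="ubuntu"):
    """Build the ssh invocation as an argument list."""
    return ["ssh", "-i", key_path, f"{user}@{host}"]


# Demonstrate on a throwaway file instead of a real key.
with tempfile.TemporaryDirectory() as demo_dir:
    demo_key = os.path.join(demo_dir, "demo-key.pem")
    with open(demo_key, "w") as handle:
        handle.write("not a real key\n")
    key_mode = restrict_key_permissions(demo_key)

command = ssh_command("demo-key.pem", "203.0.113.10")
```

The argument list returned by `ssh_command` could be passed to `subprocess.run` to open the session from a script.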

Install Dependencies

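Once you are connected to the EC2 instance, install the dependencies needed to run the Python app. The project root directory has a requirements.txt file.
Install pip3 and the required dependencies using the commands below. The last command must be run from the project root directory, so run it once the project has been copied to the instance.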

sudo apt-get update

sudo apt-get -y install python3-pip

pip3 install -r requirements.txt
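
After the install finishes, a quick way to confirm that everything is importable is a small check like the one below. This is an optional sketch, not a step from the original workflow. The module list is a guess based on the libraries named in this article; replace it with the import names behind your own requirements.txt.

```python
import importlib.util


def missing_modules(module_names):
    """Return the names that cannot be imported in this environment."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Hypothetical list; use the import names behind your requirements.txt.
required = ["dash", "plotly", "pandas", "numpy"]
missing = missing_modules(required)
if missing:
    print("Still missing:", ", ".join(missing))
else:
    print("All dependencies are importable.")
```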

Run the app on EC2

Follow the instructions here to enable S3 access from the EC2 instance.
Copy the project directory from the S3 bucket to the EC2 instance.
Change to the project directory and run the app. Use `screen` to start a detached terminal for the app so that you can close the connection to the EC2 instance without killing it. The app should now be running on localhost.

aws s3 sync s3://source-bucket-name <local directory path>

cd <Project Directory>

python3 app.py

If you are not planning to attach the web app to a domain name, you need to tell the Dash web server to listen on 0.0.0.0 instead of localhost. This ensures that the app can be accessed from anywhere at http://<EC2 IP>:<PORT>. This can be done by editing the app.py script.

app.run(host='0.0.0.0', port=8050)

You will now be able to access the interactive web app at http://<EC2 IP>:<PORT>.
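
Hard-coding the address works, but one possible refinement is to read it from the environment so that the same app.py can listen on 0.0.0.0 on the server and on 127.0.0.1 during local development. The sketch below is an assumption about how that could look, not the original app's code. `DASH_HOST` and `DASH_PORT` are made-up variable names, not settings that Dash reads on its own.

```python
import os


def server_settings(environ=None):
    """Read bind address and port, defaulting to the values used above."""
    env = os.environ if environ is None else environ
    host = env.get("DASH_HOST", "0.0.0.0")    # hypothetical variable name
    port = int(env.get("DASH_PORT", "8050"))  # hypothetical variable name
    return host, port


host, port = server_settings({})  # empty mapping: fall back to defaults
local_only = server_settings({"DASH_HOST": "127.0.0.1", "DASH_PORT": "8051"})

# In app.py the result would be passed on, for example:
#     app.run(host=host, port=port)
```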

Enable Automatic Data Update

The data source for this web app is the well-known COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University. The source is updated once a day with summary data from around the world, so the EC2 instance needs to download the data to the project directory every day and restart the web app. First, I cloned the source repo to the EC2 instance. Then I wrote a bash script that pulls the latest data from the source repository, copies it to the project directory, and restarts the web app. Finally, I used crontab to run this script every day at around 00:00 UTC. The bash script I used is given below.

#!/bin/bash

# Change directory to the Johns Hopkins GitHub repo
cd <Path to source repo in your EC2 instance>

# Update repo
sudo git pull

# Remove old data from current project
cd <Project root directory>
sudo rm -rf data/csse_covid_19_time_series*


# Move updated data to project directory
sudo cp -a <Path to csse_covid_19_time_series on source repository> <Path to data directory in project directory>

sleep 3s
# Kill the dash app
sudo killall screen
# Restart the dash app
sudo screen -d -m python3 <Path to app.py>
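
The remove-and-copy part of the script can also be written in Python, which may be handy if the refresh logic later moves into the app itself. This is a sketch of the same idea rather than the script used in this project. The directory and file names in the demonstration are made up, and the git pull and app restart steps are left out.

```python
import os
import shutil
import tempfile


def refresh_data(source_dir, target_dir):
    """Replace target_dir with a fresh copy of source_dir."""
    if os.path.isdir(target_dir):
        shutil.rmtree(target_dir)            # drop yesterday's files
    shutil.copytree(source_dir, target_dir)  # copy the updated files
    return sorted(os.listdir(target_dir))


# Demonstrate on throwaway directories with made-up file names.
with tempfile.TemporaryDirectory() as demo_dir:
    source = os.path.join(demo_dir, "source_repo", "time_series")
    target = os.path.join(demo_dir, "project", "data", "time_series")
    os.makedirs(source)
    os.makedirs(target)
    with open(os.path.join(target, "stale.csv"), "w") as handle:
        handle.write("old\n")
    with open(os.path.join(source, "fresh.csv"), "w") as handle:
        handle.write("new\n")
    refreshed = refresh_data(source, target)
```

Removing the old directory before copying mirrors the bash script: files that disappeared upstream do not linger in the project's data directory.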

You may schedule this script to run several times a day around 00:00 UTC, since the data source is usually updated around this time.

crontab -e

15 00 * * * sudo bash <Path to the bash script>
15 01 * * * sudo bash <Path to the bash script>
15 02 * * * sudo bash <Path to the bash script>
15 03 * * * sudo bash <Path to the bash script>
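
Each crontab entry starts with five fields: minute, hour, day of month, month, and day of week. The entries above therefore run at 00:15, 01:15, 02:15, and 03:15 every day. Cron uses the system time zone, so these are UTC times only if the instance clock is set to UTC. The sketch below works out the next run time for such daily entries. It handles only this simple form, and `refresh.sh` is a placeholder script name.

```python
from datetime import datetime, timedelta


def next_daily_run(entry, now):
    """Next run time for a crontab entry of the form 'MIN HOUR * * * cmd'."""
    minute, hour, day, month, weekday = entry.split()[:5]
    if (day, month, weekday) != ("*", "*", "*"):
        raise ValueError("this sketch only handles daily entries")
    candidate = now.replace(hour=int(hour), minute=int(minute),
                            second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)  # today's slot has already passed
    return candidate


entries = [
    "15 00 * * * sudo bash refresh.sh",  # refresh.sh is a placeholder name
    "15 01 * * * sudo bash refresh.sh",
]
now = datetime(2020, 6, 18, 0, 30)  # 00:30, after the first slot
upcoming = [next_daily_run(entry, now) for entry in entries]
```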

I registered a domain name using Amazon Route 53. The app is served under the new domain name through nginx, which I configured by following the answer here.

Disclaimer

This is a project I did to learn how to build a Dash app and deploy it on AWS. This may not be the best approach for this application. I take no responsibility if the steps mentioned above lead to compromising your AWS account security. Also, I assume the user is within the AWS free tier period. Keeping the EC2 instance running beyond the free tier limit may incur charges.

Moaaz Youssef

Data Scientist

1 year ago

Can I connect to you?

Dinesh Solanki

Associate Director at ISS | Institutional Shareholder Services

1 year ago

Nice article!!


William Wei, CFA, FRM

Traversing between data and investment

1 year ago

this is great. thanks for sharing. one question: do you have to run this everytime the instance is activated? or this can be done by bootstrap (EC2 user data)? sudo apt-get update sudo apt-get -y install python3-pip pip3 install -r requirements.txt

Leandro Amoras

Data Engineer at Porto

2 years ago

Yago Battaggia

Kelvin Kramp

MD, PhD

3 years ago

Hey Sreejith, thank for the post! Can you fix the links. The link to the dashboard and the link to the nginx tutorial are not working. Thanks!
