How to set up a remote development and data science environment on Amazon EC2

Recently someone asked me how to set up a Python integrated development environment (IDE), including JupyterLab, for a demo.

There were a few steps to follow to get it all going, such as installing the AWS CLI, setting up the IDE, creating virtual environments, installing packages and dependencies, and granting the correct IAM permissions to interact with various AWS services.

I did some quick searches but did not find a single complete write-up, hence this article.

Please note that there are numerous other ways to achieve the same result, such as AWS Cloud9 or Amazon WorkSpaces. This article does not compare them, nor does it suggest that this is the best option.

What are some of the benefits?

  • An identity is assigned to the instance directly, so you do not have to manage long-term (or short-term) credentials yourself
  • Increased execution speed due to reduced latency to external APIs and data sets
  • Shorter code execution and compile times thanks to access to higher-specification EC2 instances than a laptop or desktop
  • Lower downtime and increased reliability thanks to automated backups and quick restores using AWS Backup


Step1: Download and install an IDE

In this example we will use Visual Studio Code (VS Code), as it has a lot of plugins and integrates well with AWS tools.

Download link: https://code.visualstudio.com/download


Step2: Launch an EC2 instance

Instance family, size, region and AMI decisions are yours.

Things to pay special attention to are:

  • Generate a new key pair or use an existing private key for authentication
  • Use a Linux image (Amazon Linux or Ubuntu)
  • Launch the instance in a public subnet
  • Allow inbound SSH from the internet (ideally only from your public IP address)
  • Allocate an Elastic IP address and associate it with your instance, so that its public IP address does not change after a stop/start
  • Very important: assign a role to the instance that gives it permission to interact with this AWS account programmatically using APIs
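If you prefer the AWS CLI to the console, the SSH rule and the Elastic IP from the list above could be set up roughly as follows. This is only a sketch. The function names are made up for illustration, the IDs and the IP address are placeholders, and it assumes the AWS CLI is already configured with enough permissions on the machine where you run it.

```shell
# Sketch only: function names, IDs and addresses are placeholders.

# Allow inbound SSH (TCP 22) from a single IPv4 address only.
allow_ssh_from_ip() {
    # $1 = security group ID, $2 = your public IPv4 address
    aws ec2 authorize-security-group-ingress \
        --group-id "$1" --protocol tcp --port 22 --cidr "$2/32"
}

# Allocate an Elastic IP and associate it with the instance.
attach_elastic_ip() {
    # $1 = instance ID
    allocation_id=$(aws ec2 allocate-address --domain vpc \
        --query AllocationId --output text) || return 1
    aws ec2 associate-address \
        --instance-id "$1" --allocation-id "$allocation_id"
}

# Example (placeholder values):
#   allow_ssh_from_ip sg-0123456789abcdef0 198.51.100.7
#   attach_elastic_ip i-0123456789abcdef0
```

Restricting the rule to a /32 keeps SSH closed to everyone except your own address.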

You should now be able to SSH to the instance using the private key.
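As a minimal sketch, the connection could look like this. The function name, key path and address are hypothetical. Amazon Linux AMIs use ec2-user as the default login user and Ubuntu AMIs use ubuntu.

```shell
# Sketch only: connect_ec2 is an illustrative helper, not part of any tool.
connect_ec2() {
    # $1 = private key file, $2 = login user, $3 = Elastic IP or DNS name
    chmod 400 "$1" || return 1   # ssh rejects keys that other users can read
    ssh -i "$1" "$2@$3"
}

# Example (placeholder values):
#   connect_ec2 ~/.ssh/my-ec2-key.pem ec2-user 203.0.113.10
```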

(Note: all resources will be deleted prior to publishing the article, so there is no point trying to break in :))


Step3: Connect to the EC2 instance from VS Code

Use the private key of the instance from Step2 above.

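Update the .ssh/config file (usually in your home directory) with a Host entry for the instance: its Elastic IP address as HostName, the login user as User, and the private key as IdentityFile.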
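A minimal sketch of such an entry, written by a small helper so the values are easy to swap. The helper name, the alias ec2-dev, the address and the key path are all assumptions for illustration.

```shell
# Sketch only: append a Host entry to an SSH config file.
add_ssh_host() {
    # $1 = config file, $2 = alias, $3 = Elastic IP, $4 = user, $5 = key file
    mkdir -p "$(dirname "$1")"
    cat >> "$1" <<EOF
Host $2
    HostName $3
    User $4
    IdentityFile $5
EOF
    chmod 600 "$1"
}

# Example (placeholder values):
#   add_ssh_host ~/.ssh/config ec2-dev 203.0.113.10 ec2-user ~/.ssh/my-ec2-key.pem
```

With an entry like this in place, the alias is enough to reach the instance, both from a terminal and from the IDE.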

Then connect using VS Code (this relies on the Remote - SSH extension).

Specify the EC2 instance's Elastic IP address. Assuming the .ssh/config file is correctly configured, select that host to log in to the instance.

If everything has been set up properly, you should get a successful connection.

Now we will move on to the other things needed to get the EC2 instance properly set up as an IDE.

Step4: Configure the EC2 instance

Open a terminal from VS Code

Run updates and install Git
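The exact commands depend on the AMI you picked. As a rough sketch, assuming dnf or yum on Amazon Linux and apt-get on Ubuntu (the function names are illustrative):

```shell
# Sketch only: pick whichever package manager exists on this AMI.
detect_pkg_manager() {
    for pm in dnf yum apt-get; do
        if command -v "$pm" >/dev/null 2>&1; then
            echo "$pm"
            return 0
        fi
    done
    return 1
}

# Apply pending updates, then install Git.
update_and_install_git() {
    pm=$(detect_pkg_manager) || { echo "no supported package manager" >&2; return 1; }
    if [ "$pm" = "apt-get" ]; then
        sudo apt-get update && sudo apt-get upgrade -y && sudo apt-get install -y git
    else
        sudo "$pm" update -y && sudo "$pm" install -y git
    fi
}

# Example:
#   update_and_install_git && git --version
```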

Check that the role has been correctly assigned to the instance
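One way to check this is to ask AWS who the instance is acting as. The sketch below assumes the AWS CLI is available on the instance (it comes preinstalled on Amazon Linux; on Ubuntu you would install it first). The function names are made up for illustration.

```shell
# Sketch only: no access keys are configured here. The CLI picks up the
# role's temporary credentials from the instance metadata service.

# Print the ARN the instance is currently acting as.
show_instance_identity() {
    aws sts get-caller-identity --query Arn --output text
}

# Succeed only if that ARN belongs to an assumed role.
has_instance_role() {
    case "$(show_instance_identity)" in
        arn:aws*:sts::*:assumed-role/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   has_instance_role && echo "role attached" || echo "no role found"
```

If the role is attached, the ARN should contain assumed-role followed by the role name and the instance ID.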

Optional: Install Poetry for environment management
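A minimal sketch using Poetry's official installer script, assuming python3 and curl are present on the instance. The installer places the poetry command in ~/.local/bin, so the sketch also adds that directory to PATH for the current shell. The function name is illustrative.

```shell
# Sketch only: install Poetry for the current user.
install_poetry() {
    curl -sSL https://install.python-poetry.org | python3 - || return 1
    # Make the poetry command visible in this shell session
    case ":$PATH:" in
        *":$HOME/.local/bin:"*) ;;
        *) PATH="$HOME/.local/bin:$PATH"; export PATH ;;
    esac
    poetry --version
}

# Example:
#   install_poetry
```

To keep the PATH change across sessions, add the same directory to PATH in your shell profile.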

Let's create a new folder for the project

Use poetry install to create a virtual environment and install dependencies
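These two steps could be sketched as below. The folder name is a placeholder and the function name is illustrative. Two assumptions of mine that the steps above do not spell out: poetry init is used to generate a pyproject.toml in the empty folder, and --no-root is passed because the new folder has no package of its own to install yet.

```shell
# Sketch only: create a project folder and its Poetry environment.
create_project() {
    # $1 = project directory
    mkdir -p "$1" && cd "$1" || return 1
    # Generate a minimal pyproject.toml without prompts, if none exists yet
    [ -f pyproject.toml ] || poetry init --no-interaction || return 1
    # Create the virtual environment and install the declared dependencies
    poetry install --no-root
}

# Example (placeholder folder name):
#   create_project ~/projects/demo
```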

Navigate to and open the project folder in VS Code

Install the JupyterLab package using Poetry

Launch JupyterLab and note the URLs with tokens used to access the environment
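The install and launch steps could look roughly like this when run from inside the project folder. The function names are illustrative. Binding to 127.0.0.1 and the default port 8888 are my assumptions: they keep JupyterLab off the public interface, so it is reached through a forwarded port instead.

```shell
# Sketch only: run these from inside the project folder.

# Add JupyterLab to the project's dependencies.
install_jupyterlab() {
    poetry add jupyterlab
}

# Start JupyterLab without trying to open a browser on the server.
start_jupyterlab() {
    # $1 = port (optional, defaults to 8888)
    poetry run jupyter lab --no-browser --ip=127.0.0.1 --port="${1:-8888}"
}

# Example:
#   install_jupyterlab && start_jupyterlab
```

On startup JupyterLab prints its URLs, including the token, in the terminal.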

Access JupyterLab using the URL!
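VS Code usually forwards the port for you once it spots the URL in the terminal. If you lose the URL or want to forward the port yourself, something along these lines may help. The function names and the ec2-dev alias are hypothetical, and the sketch assumes JupyterLab listens on 127.0.0.1 on the instance.

```shell
# Sketch only: helpers for finding and reaching a running JupyterLab.

# Re-print the URL (including the token) of each running Jupyter server.
jupyter_urls() {
    poetry run jupyter server list | awk '/^http/ { print $1 }'
}

# Forward a local port to the same port on the instance over SSH.
forward_jupyter_port() {
    # $1 = SSH host alias, $2 = port (optional, defaults to 8888)
    ssh -N -L "${2:-8888}:127.0.0.1:${2:-8888}" "$1"
}

# Example (placeholder alias):
#   on the instance:  jupyter_urls
#   on your laptop:   forward_jupyter_port ec2-dev
```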

Conclusion:

In this article you saw how easy it is to set up a remote IDE on EC2 to accelerate development.

Disclaimer:

  • This article is my personal opinion and has not been endorsed by AWS
  • This article addresses specific use-cases and may not be useful or applicable to everyone
  • Any AWS usage charges incurred by following this article are your responsibility



More articles by Vijay Shekhar R.
