Migration Story: Moving High Scale Data and Compute from AWS to Azure
Background
We recently worked with Emedgene, a participant in the Microsoft Partners Program, to develop a next-generation genomics intelligence platform that uses advanced artificial intelligence technologies to streamline the interpretation and evidence presentation process. With this platform, healthcare providers will be able to provide individualized care to more patients through improved yield.
Emedgene continues to grow and, given their positive collaboration experience with Microsoft engineers, decided to migrate their solution from AWS to Azure with support from Microsoft. This code story demonstrates how to migrate the compute resources to Azure, transfer more than 100 TB of blob storage and handle application secrets without embedding the Azure SDK in the application code.
Architecture & Migration
A key part of Emedgene’s architecture is the provisioning of new EC2 Spot instances to execute computationally heavy analytics processes that take large sets of genomics data in S3 as input. The metadata for each analytics job is placed in a queue for processing by the EC2 instances. The number of instances is dynamic and varies with the number of messages in the queue.
Compute: EC2 instances are provisioned by another EC2 instance that monitors a Redis queue for new jobs.
Data: The genomic datasets can comprise over one million individual files with a cumulative size of up to 100 TB. To perform very fast analytics, Emedgene copies the relevant sets of files from S3 to the instance’s attached disks each time an instance is provisioned. This gives higher throughput and lower network latency between the instances and the data. In Azure, we copy these files from Azure Data Lake Store to the VM’s attached disks in the same way.
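The queue-driven sizing described above can be sketched as a small function. This is only an illustration of the idea, not Emedgene’s actual logic: the function name, the one-message-per-instance default and the instance limits are all assumptions introduced for the example.

```python
import math


def desired_instance_count(queued_messages: int,
                           messages_per_instance: int = 1,
                           min_instances: int = 0,
                           max_instances: int = 20) -> int:
    """Return how many worker instances to run for the current queue depth.

    All parameters are hypothetical; the article does not state the real
    thresholds or limits.
    """
    if queued_messages < 0:
        raise ValueError("queued_messages must be non-negative")
    if messages_per_instance < 1:
        raise ValueError("messages_per_instance must be at least 1")

    # One instance per batch of messages, rounded up so no job is left waiting.
    wanted = math.ceil(queued_messages / messages_per_instance)

    # Clamp to the allowed range so an empty queue scales down to the minimum
    # and a burst of jobs cannot exceed the maximum.
    return max(min_instances, min(max_instances, wanted))
```

With these assumed defaults, an empty queue yields zero instances and a queue of 50 messages is capped at 20 instances.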
To scale natively without a separate application module (the AWS solution relied on Redis queues and additional EC2 instances), we used Virtual Machine Scale Sets (VMSS). VMSS enables us to monitor an Azure Service Bus queue for messages and provision new instances when the queue reaches a certain threshold. Once the application finishes its task, it invokes a script (Self Destroy Instance) that deletes the VM instance from the scale set. The script can be invoked in a Docker container for maximum flexibility in the deployment process.
Note: We considered working with Azure Batch Low Priority VMs, but it does not fully support scaling based on Azure Service Bus or custom VM images.
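The article does not show the Self Destroy Instance script itself. As a rough sketch of what such a step might look like, the Python below builds an `az vmss delete-instances` command for the current VM. It assumes the VM is named `<scale set name>_<instance id>`, which should be verified for a given deployment; the helper names and the resource group, scale set and VM names are hypothetical.

```python
import shlex
from typing import List


def instance_id_from_vm_name(vm_name: str) -> str:
    """Extract the scale-set instance ID from a VM name such as 'workers_12'.

    Assumes the '<scale set name>_<instance id>' naming pattern.
    """
    prefix, sep, instance_id = vm_name.rpartition("_")
    if not sep or not prefix or not instance_id.isdigit():
        raise ValueError(f"unexpected scale-set VM name: {vm_name!r}")
    return instance_id


def build_self_destroy_command(resource_group: str,
                               scale_set: str,
                               vm_name: str) -> List[str]:
    """Build the Azure CLI call that removes this VM from its scale set."""
    return [
        "az", "vmss", "delete-instances",
        "--resource-group", resource_group,
        "--name", scale_set,
        "--instance-ids", instance_id_from_vm_name(vm_name),
    ]


if __name__ == "__main__":
    # Hypothetical names; a real script would read them from its environment.
    command = build_self_destroy_command("genomics-rg", "workers", "workers_12")
    print(shlex.join(command))
```

Returning the command as a list of arguments keeps the sketch testable without touching Azure; a real script could hand the list to `subprocess.run` once the job has finished.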
The DevOps flow
The Continuous Integration / Continuous Deployment (CI/CD) process is managed with Jenkins. While Jenkins provides a lot of flexibility, we needed a way to provision and manage Azure resources in the pipeline. To do this, we used Azure CLI 2.0, but we also needed to propagate results such as names and paths from each command to the next.
For example, a CLI command returns a JSON result, and we want to take its “name” property, which is dynamic, and propagate it to another command.
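As a minimal sketch of this kind of propagation, assuming the CLI prints a JSON object, the Python below reads the “name” property and feeds it into a follow-up command. The sample JSON, the helper names and the choice of `az storage account show` as the next command are illustrative assumptions, not the pipeline’s actual steps.

```python
import json
from typing import List


def extract_property(cli_output: str, key: str = "name") -> str:
    """Parse JSON printed by a CLI command and return one string property."""
    result = json.loads(cli_output)
    value = result[key]
    if not isinstance(value, str) or not value:
        raise ValueError(f"property {key!r} is missing or not a string")
    return value


def build_show_command(account_name: str, resource_group: str) -> List[str]:
    """Build a follow-up Azure CLI call that uses the propagated name."""
    return [
        "az", "storage", "account", "show",
        "--name", account_name,
        "--resource-group", resource_group,
    ]


if __name__ == "__main__":
    # Hypothetical output from an earlier pipeline step.
    previous_output = '{"name": "genomicsstore1234", "resourceGroup": "genomics-rg"}'
    name = extract_property(previous_output)
    group = extract_property(previous_output, "resourceGroup")
    print(build_show_command(name, group))
```

In a shell-only Jenkins step, a similar effect can usually be had with the CLI’s global `--query` and `--output tsv` arguments, which print a single field that can be captured in a variable.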
Visit the Microsoft Developer Blog for the rest of the article.
Article by Tomer Rosenthal, Ilana Kantorov & Aaron (Ari) Bornstein on the Microsoft Developer Blog.