Galaxy for AnswerALS on Microsoft Azure and Kubernetes

I'm happy to share that this contribution to Galaxy has been accepted by the Scientific Program Committee for discussion at the Galaxy Community Conference (https://gccbosc2018.sched.com/). This milestone was accomplished with significant contributions from RC Carter (Microsoft), Abhik Ghosh (Applied Information Systems), Alex Lenail (MIT), Enis Afgan (Johns Hopkins University), Nuwan Goonasekera (University of Melbourne) and Gaurav Hind (Microsoft). Yes, truly global, isn't it?

What's AnswerALS?

Answer ALS is a global project dedicated to developing and implementing a unified strategy to stop Amyotrophic Lateral Sclerosis (ALS) through an aggressively funded agenda. It pursues change by unifying the global community around agreed-upon goals in research, science, technology and education. Answer ALS originated as a result of the 2013 ALS Team Gleason Summit, which brought together leading researchers, patients, caregivers and advocates. The event was spearheaded by former NFL player Steve Gleason, who lives with ALS and founded the ALS advocacy group Team Gleason. The goal of the Summit was to create a plan to find a treatment or cure for ALS as quickly as possible. Within a year after the Summit, that strategic plan was developed, and it is now Answer ALS. More insights: https://answerals.org/

What's Galaxy?

Galaxy is an open-source, web-based platform for data-intensive biomedical research. The Galaxy Project is supported in part by NSF, NHGRI, The Huck Institutes of the Life Sciences, The Institute for CyberScience at Penn State, and Johns Hopkins University. More insights: https://usegalaxy.org/

What gap did we bridge?

We advanced a Galaxy configuration to execute the Neurolincs ATAC-Seq and RNA-Seq data pipelines in support of the AnswerALS Foundation research plan seeking a cure for ALS. We set out to build a system capable of supporting a variety of workloads and opted for a Kubernetes-based implementation using Helm Charts that dynamically scales the available compute infrastructure in the cloud (also known as bursting to the cloud, with Azure as the first cloud implementation). The system relies on current Galaxy capabilities and is consistent with long-term support of Galaxy CloudMan 2.0. It is designed to support the entire corpus of data and compute artifacts from 1,000 ALS patients (over 60 TB of data). This sizable volume of data is being ingested through a combination of upload methods: FTP, HTTP upload via the Galaxy web user interface, and external bulk upload into an NFS volume via tools such as AzCopy. Currently, the system is deployed using Azure Kubernetes Service (AKS), with a dedicated NFS server acting as a cluster-wide file system.
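To put the data volume in perspective, here is a rough back-of-the-envelope sketch in Python. Only the 1,000 patients and the 60 TB figure come from this post. The decimal units and the 1 Gbit/s sustained upload rate are assumptions picked purely for illustration, not measurements from the actual deployment.

```python
# Back-of-the-envelope sizing for the figures quoted in the post.
# Assumptions (NOT from the post): decimal units (1 TB = 1000 GB) and a
# hypothetical sustained upload rate used only for illustration.

TOTAL_TB = 60      # the post says "over 60 TB", so treat this as a lower bound
PATIENTS = 1000    # from the post
GB_PER_TB = 1000   # assumption: decimal units


def gb_per_patient(total_tb: float, patients: int) -> float:
    """Average data volume per patient, in GB."""
    return total_tb * GB_PER_TB / patients


def transfer_days(total_tb: float, rate_gbit_per_s: float) -> float:
    """Days needed to move total_tb at a sustained rate given in Gbit/s."""
    total_bits = total_tb * 1e12 * 8
    seconds = total_bits / (rate_gbit_per_s * 1e9)
    return seconds / 86400


if __name__ == "__main__":
    print(f"{gb_per_patient(TOTAL_TB, PATIENTS):.0f} GB per patient on average")
    print(f"{transfer_days(TOTAL_TB, 1.0):.1f} days at a sustained 1 Gbit/s")
```

Under these assumptions the average works out to about 60 GB per patient, and moving the whole corpus over a single sustained 1 Gbit/s link would take several days, which is one reason bulk upload paths sit alongside the web interface.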

(Caveat: The boilerplate code is still being worked on to enable a single-click "Deploy to Azure" on GitHub, much like the existing GitHub templates for Azure.)

Website URL: https://galaxy.answerals.net

Code repository: https://github.com/rc-ms/galaxy-azure-k8s-helm-htcondor



