Interactive Analytics for Very Large Scale Genomic Data

Somalee Datta

Specialization in petascale computing for health and biotech research applications; Broad experience in healthcare research, genomics, drug design, privacy and everything in between...

Published: January 11, 2016

In a collaborative effort, Stanford University, the Epidemiological Research and Information Center (ERIC) for Genomics at VA Palo Alto, and Google Genomics show how a low-cost database, BigQuery, can be used for very large scale variant analytics.

Our manuscript on the pre-print server shows the end-to-end workflow for variant mining. As a pedagogic tool, we show how to run variant QC, but the data model also supports typical biological queries. Most notably, we show scaling and cost effectiveness. Most queries take a few seconds (as opposed to an hour or two on a server or cluster), which makes data exploration interactive rather than batch mode. Interactivity brings a flexibility to hypothesis development and testing that batch mode cannot offer.
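To give a flavor of what a variant QC query can look like, here is a minimal sketch that computes a per-sample transition/transversion (Ti/Tv) ratio with a single SQL aggregate. It is not taken from the manuscript: the `variants` table, its columns, the toy rows, and the helper `titv_by_sample` are all assumptions made for illustration, and it runs against an in-memory SQLite database rather than BigQuery so that it is self-contained.

```python
import sqlite3

# Toy variant calls: (sample_id, chrom, pos, ref, alt).
# All values are made up for illustration; this is not real data.
ROWS = [
    ("s1", "chr1", 100, "A", "G"),  # transition
    ("s1", "chr1", 200, "C", "T"),  # transition
    ("s1", "chr1", 300, "A", "C"),  # transversion
    ("s2", "chr1", 100, "G", "A"),  # transition
    ("s2", "chr2", 50, "G", "T"),   # transversion
]

# Hypothetical schema and query: count transitions (A<->G, C<->T) and
# transversions (every other single-base change) per sample.
TITV_SQL = """
SELECT sample_id,
       SUM(CASE WHEN (ref || alt) IN ('AG', 'GA', 'CT', 'TC') THEN 1 ELSE 0 END) AS ti,
       SUM(CASE WHEN (ref || alt) IN ('AG', 'GA', 'CT', 'TC') THEN 0 ELSE 1 END) AS tv
FROM variants
WHERE length(ref) = 1 AND length(alt) = 1  -- single-nucleotide variants only
GROUP BY sample_id
ORDER BY sample_id
"""


def titv_by_sample(rows):
    """Return {sample_id: Ti/Tv ratio}, or None when a sample has no transversions."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE variants "
            "(sample_id TEXT, chrom TEXT, pos INTEGER, ref TEXT, alt TEXT)"
        )
        conn.executemany("INSERT INTO variants VALUES (?, ?, ?, ?, ?)", rows)
        result = {}
        for sample_id, ti, tv in conn.execute(TITV_SQL):
            result[sample_id] = ti / tv if tv else None
        return result
    finally:
        conn.close()


if __name__ == "__main__":
    for sample, ratio in titv_by_sample(ROWS).items():
        print(sample, ratio)
```

The point of the sketch is the shape of the query: one scan over a few columns with a grouped aggregate, which is the kind of workload a columnar engine handles well. The same idea would need to be adapted to the actual schema and SQL dialect of whatever database holds the variants.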

At Stanford, our mission is to bring solutions to researchers, ours and those in the rest of the world, that not only meet workflow requirements but are also easy to learn, easy to manage (no army of IT professionals needed), and cost effective (supportable by typical levels of NIH funding).

Our solution is accessible to anyone on Google Cloud, but the underlying data models and queries can be replicated using a Dremel-style columnar query engine (e.g. Apache Drill).

Please leave your comments on our methods on the pre-print server.

Madhavi Tikhe

Software Architect

9 years ago

This system really looks promising.

Quoclinh Nguyen

Bioinformatics & Data Science Professional

9 years ago

Nice system! How does the system perform on low allele frequency of somatic samples?
