The Genomics Revolution – AI & our Health

For most of human history, average life expectancy was only about 26 years.

We would procreate at age 13, live just long enough to help our children raise their children, and then, on average, die at age 26 (so we were no longer taking food from the mouths of our grandchildren).

It was through technological innovation (sanitation and germ theory) that we moved life expectancy from 26 to the mid-50s. More recently, modern medicine’s progress in treating heart disease and cancer has raised the global average life expectancy to about 71 years.

Reading — Sequencing the Human Genome

Your genome is the software that runs your body.

It is composed of 3.2 billion “letters,” or base pairs, that code for everything that makes you “you” — your hair color, your height, your personality, your propensity to disease, your lifespan, and so on.

Until recently, it’s been very difficult to rapidly and cheaply “read” these letters and even more difficult to understand what they do.

 Data Mining + Genomics: We can now sequence the full genomes of millions of individuals and then mine all of that data to work out what the genome means. Sequencing each person’s genome produces a text file of about 300 gigabytes. When we compare your sequenced genome with millions of other people’s genomes and other health data sets (like your microbiome, metabolome and MRI data), we can use machine learning techniques to correlate certain traits (eye color, what your face looks like) or diseases (Alzheimer’s, Huntington’s) with factors in the data and begin to develop diagnostics and therapies around them.
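A figure of about 300 gigabytes can look odd next to 3.2 billion letters, which would fit in a few gigabytes at one byte per letter. One plausible reading, and it is only an assumption here, is that the figure refers to raw sequencing output, where every position is read many times and each read carries a name line and per-base quality scores. The sketch below works through that arithmetic; the coverage, read length and header size are illustrative guesses, not numbers from this article.

```python
# Back-of-envelope estimate of raw sequencing output size (FASTQ-style text).
# Every parameter except GENOME_BASES is an illustrative assumption.

GENOME_BASES = 3.2e9   # letters in one human genome (figure from the text)
COVERAGE = 30          # assumed: each position is read about 30 times
READ_LENGTH = 150      # assumed: letters per sequencing read
HEADER_BYTES = 50      # assumed: bytes in each read's name line


def raw_text_size_gb(genome_bases=GENOME_BASES, coverage=COVERAGE,
                     read_length=READ_LENGTH, header_bytes=HEADER_BYTES):
    """Estimate the uncompressed size, in gigabytes, of FASTQ-like text."""
    n_reads = genome_bases * coverage / read_length
    # One record = name line + sequence line + "+" line + quality line,
    # each followed by a newline character.
    bytes_per_read = (header_bytes + 1) + (read_length + 1) + 2 + (read_length + 1)
    return n_reads * bytes_per_read / 1e9


def single_copy_size_gb(genome_bases=GENOME_BASES):
    """Size of a single copy of the genome at one byte per letter."""
    return genome_bases / 1e9


print(f"one copy of the letters: {single_copy_size_gb():.1f} GB")
print(f"raw reads at {COVERAGE}x coverage: {raw_text_size_gb():.0f} GB")
```

Under these assumptions the raw text comes out at a little over 200 GB, the same order of magnitude as the figure quoted above, while a single copy of the letters is only about 3 GB.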

 N-of-1 Care: This is one of the most powerful and important changes coming in healthcare. When we understand your genome, we’ll be able to understand how to optimize “you.” We’ll know the perfect foods, the perfect drugs, the perfect exercise regimen, and the perfect supplements, just for you. We’ll understand what microbiome types (gut flora) are ideal for you. We’ll understand which diseases and illnesses you are most likely to develop, and we’ll be able to prevent them from developing (rather than trying to cure them after the fact). Right now “healthcare” is actually “sick care”: your doctor tries to find quick fixes to make you feel better. With genomics, we’ll tackle the root of the problem and eventually eliminate disease altogether.

Although genomics is relatively unknown to the general public, innovations in the field have started to make headlines: Genetic testing startup 23andMe, the “gene editing” technology CRISPR and the ambitious 100,000 Genomes Project have all come into the public eye.

Writing — What is CRISPR/Cas9?

CRISPR stands for “Clustered Regularly Interspaced Short Palindromic Repeats.” These repeated DNA sequences were first observed in bacteria in 1987 and were later found to be part of a bacterial defense system.

The CRISPR/Cas system (Cas stands for “CRISPR associated” genes) was found in prokaryotes such as bacteria, where it identifies and cuts *specific/targeted* foreign genetic material that may be harmful to the cell.

It turns out that we can use this same mechanism to target and cut specific stretches of our own DNA. In other words, the CRISPR/Cas system is a way to **edit** our genome.

We can remove specific sequences and we can insert specific nucleotide modifications at specific target locations.
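To give a feel for what “specific target locations” means in practice, here is a minimal sketch of how candidate cut sites can be located in a DNA string. It relies on one detail not covered above: the widely used Cas9 enzyme from Streptococcus pyogenes recognizes a target of about 20 letters only when it is immediately followed by a short “NGG” motif (any letter, then two Gs). The function name `find_target_sites` and the demo sequence are hypothetical, and the sketch scans only the strand it is given, so treat it as an illustration rather than a guide-design tool.

```python
import re


def find_target_sites(dna, guide_length=20):
    """Return (position, target, motif) for each candidate site on this strand.

    A candidate is `guide_length` letters immediately followed by a
    three-letter motif of the form NGG (any letter, then two Gs).
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidates are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_length}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


# Made-up demo sequence, not taken from any real genome.
demo = "ttgaccatgcaagtcctagcatcgtaggacctgatcgatcgtacgatgg"
for position, target, motif in find_target_sites(demo):
    print(position, target, motif)
```

Real guide design also has to check the opposite strand and rule out near-identical sequences elsewhere in the genome, which this sketch does not attempt.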

Interpreting 23andMe Raw Genome Data with Google Genomics and BigQuery

If you want to learn more about your family history or your predisposition to illnesses, or if you are simply curious about your genetic makeup, a service like 23andMe lets you obtain your raw genome data. Google Genomics and BigQuery can then help you process that data and draw insights from your genome.

23andMe allows you to browse and download your raw genotype data, which can give you additional insight into your DNA beyond the data used in the main 23andMe service. Each marker links to its dbSNP page, which gives a lot of technical detail about that DNA marker.

Genome exploration is still somewhat new, but there are a lot of resources and tons of depth to this work. It’s really cool and interesting what insights we can pull from our genome. Just look at the popular markers on SNPedia.

Once the 23andMe raw data is loaded into Google BigQuery, you can do a number of things with it. Resources like SNPedia allow you to query their database for genotype groups associated with diseases such as Parkinson’s or Alzheimer’s.

Now you can quickly query your genome’s SNPs (“snips”), which are common genetic variations among people.

SNPs (single nucleotide polymorphisms) are biological markers in your DNA that can help locate genes associated with disease and lifestyle. They are the most common genetic variations between people. SNPs can help track the inheritance of disease genes, and future studies are expected to identify SNPs associated with diseases such as diabetes and cancer.
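Before loading anything into a cloud service, the same kind of SNP lookup can be tried locally. The sketch below assumes the raw-data export is a tab-separated text file with four columns (rsid, chromosome, position, genotype) and comment lines starting with “#”. The function names and the rsids in `SAMPLE` are placeholders invented for illustration, not real markers, so check the layout against your own download before relying on it.

```python
import csv
import io


def parse_raw_genotypes(lines):
    """Map rsid -> (chromosome, position, genotype) from raw-data lines."""
    snps = {}
    # Skip blank lines and "#" comment lines before handing rows to csv.
    data_lines = (ln for ln in lines if ln.strip() and not ln.startswith("#"))
    for rsid, chrom, pos, genotype in csv.reader(data_lines, delimiter="\t"):
        snps[rsid] = (chrom, int(pos), genotype)
    return snps


def lookup(snps, rsids):
    """Return {rsid: genotype or None} for the markers of interest."""
    return {rsid: (snps[rsid][2] if rsid in snps else None) for rsid in rsids}


# Tiny made-up sample in the assumed layout; "--" stands for a no-call.
SAMPLE = """\
# rsid\tchromosome\tposition\tgenotype
rs0000001\t1\t10001\tAA
rs0000002\t1\t20002\tCT
rs0000003\tX\t30003\t--
"""

snps = parse_raw_genotypes(io.StringIO(SAMPLE))
print(lookup(snps, ["rs0000002", "rs9999999"]))
```

To run it on a real export, pass an open file object instead of the `io.StringIO` sample, and replace the placeholder rsids with the markers you want to check on SNPedia or dbSNP.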

We’ve developed technologies such as liquid biopsies that can detect cancer DNA even in the first several days after its arrival. Researchers have also discovered five epigenetic marks for cancer, and we can expect more to come as our capabilities shift to the molecular level of our physiology, namely through research on the epigenome and epigenetic marks.

While this is undoubtedly a positive step, this level of early detection raises important questions as well. How can we know when a tumor detected at a molecular level will express itself as a true disease? How can we handle “over-diagnosis,” if this proves to be a problem?

Answering this type of question comes down not only to the technology itself, but also to our experience with big health data and deep learning. We need more data to be collected and analyzed, a priority that’s been taken on by Illumina’s GRAIL project, whose results are highly anticipated.

Big data can also help scientists closely monitor the efficacy of drugs and detect very early signs of drug resistance. Major names such as Roche and Johns Hopkins are leading the way, while companies such as Guardant Health are expanding their operations worldwide.

Better understanding of genomics through deep learning and AI

Data is not only the cornerstone of cancer treatment. It also enables us to develop an idea of functional genomics, not just coding. New companies are pushing algorithmic technology forward, such as Deep Genomics (which raised $3.7 million last December) and iCarbonX.

By using deep learning and AI, they hope to give us more applicable knowledge of the human genome. But to do that, they need as much data as possible, more than traditional medicine has been accustomed to using. While it may take time to push this type of data collection and analysis forward, these companies are expected on the market this year, and there is a great amount of excitement around them. In February 2018, the Dubai Health Authority (DHA) proposed its Genome project for the Dubai 10X Initiative in an effort to upgrade skills and knowledge among medical professionals and forecast the future of human health.

Cloud-based software facilitates faster and better analysis of genomic information

A new solution for genome analysis was brought to market by a collaboration between the Broad Institute of MIT and Harvard and industry giants Amazon Web Services (AWS), Cloudera, Google, IBM, Intel and Microsoft. Offered as software-as-a-service (SaaS), the traditionally desktop-based GATK software can now be accessed through the cloud, built on the Apache Spark computing framework.

Gene editing: toward a “customized” gene?

Synthetic DNA production has already started, and will continue to grow this year, giving researchers fast, affordable and secure access to any DNA sequences they may need.

CRISPR technology is already reported to be capable of cutting and editing genes, and has shown promising results in lung cancer and leukemia.

The implications of this technology are almost overwhelming. Essentially, with accessible and affordable CRISPR technology, we will be able not only to treat hereditary disorders and other mutations that give rise to disease, but also to modify hereditary characteristics, make genetic enhancements or perhaps create “customized” embryos.

This idea of human enhancement has given rise to an ethical debate around CRISPR therapies and other similar techniques. This debate is more than important; it is crucial for the future of humanity, and it’s becoming clear that we will need strong and harmonized international regulations. But these need to be regulations that don’t slam the brakes on the work of researchers who still have so much to discover. They also need to take into account all interested parties: scientists, doctors, patients and even parents.

All these breakthroughs show that the coming year is going to be full of opportunities for innovators and for the medtech industry at large. Everyone, whether patient, medical professional, politician, entrepreneur or investor, has a part to play in the genomic revolution. The societal impacts of such innovations and the need for better data collection make it crucial for all stakeholders to move forward in a collaborative effort.

Healthcare today is like a repairman who is trying to constantly fix a leaky roof by putting a bucket under the leak. Healthcare tomorrow is like using a scanning device to find the weakest part of the roof and reinforce it before the leak begins.

In the next decade, advances in genome sequencing, data analytics, synthetic biology and stem cell therapeutics will allow us to tackle the roots of the problems.

We are headed to a world without chronic diseases, with longer, healthier lives, and with personalized care for everyone on the planet.

