Is AI-driven drug discovery really the future of biotech? Part II.
Illustration by Michele Marconi for Nature Spotlight: Biopharmaceuticals. Nature 557, S55-S57 (2018). doi: https://doi.org/10.1038/d41586-018-05267-x

The first part of “Is AI-driven drug discovery really the future of biotech?” touched on the differences between AI and machine learning, and on AI’s apparent “recent” surge into the biotech space.

In Part II, we will look at where in the drug development process AI is being used, who the big players in the space are, and some of the pain points.

It may surprise you (or perhaps not) that AI within drug development is nothing new. Bryan Walker, CEO of San Diego-based Animantis, commented that AI has been operating under the guise of big data, machine learning, deep learning, computational neural networks, cognitive computing, and large language models for the last 20+ years. It has simply been revamped and given a new name: “AI”.

So, what does that look like? I found the simplest and most succinct explanation in the most unexpected of places: the BBC. “The key to all machine learning is a process called ‘training’ where a computer program is given a large amount of data - sometimes with labels explaining what the data is, and a set of instructions. The program will then search for patterns in the data it has been given to achieve specific goals. What the program learns from the data and the clues it is given becomes the AI model, and the training material ends up defining its abilities.”
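To make the BBC description a little more concrete, here is a minimal sketch of “training” in Python. It is not how any company mentioned in this post builds its models; it is a toy nearest-centroid classifier. The two features, the labels, and every number in `training_data` are made up purely for illustration.

```python
from collections import defaultdict


def train(examples):
    """'Training': learn one average feature vector (centroid) per label."""
    sums = {}
    counts = defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    # The learned centroids are the "model".
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(model, features):
    """Assign the label whose centroid is closest to the new example."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


# Hypothetical labeled data: two invented compound descriptors per example.
training_data = [
    ([1.0, 1.2], "inactive"),
    ([1.2, 0.8], "inactive"),
    ([4.0, 3.9], "active"),
    ([4.2, 4.1], "active"),
]

model = train(training_data)
prediction = predict(model, [3.8, 4.0])
print(prediction)
```

The point of the sketch is the last sentence of the BBC quote: the model is nothing more than what was extracted from the training examples, so mislabeled or unrepresentative examples lead directly to wrong predictions.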

Now that we have the basics out of the way, let’s dive into some of the complexities. Where in the drug development process is AI being utilized? The MIT Tech Review notes that the current generation of AI companies is focusing on three key failure points in the drug development pipeline: picking the right target in the body, designing the right molecule to interact with it, and determining which patients that molecule is most likely to help. To expand on that a bit, I’ve come up with 4 “main buckets” AI/ML can fall into (with representative example companies):

To bring some clarity into the buckets, Stephen MacKinnon, VP of digital chemistry at Recursion, elaborates that new biology and chemistry are often interconnected. New chemistry also involves understanding what each compound is doing and what effects it’s having within the cellular system as a whole. A compound’s impact on a living system is the product of all its interactions with biological molecules, including off-target proteins, transporters, metabolizers, etc.

As Daphne Koller put it in an interview with a16z’s Vijay Pande, the area of drug discovery that is the most advanced in terms of AI/ML development and utilization is taking biological targets and turning them into therapeutic interventions (drug design or selection of combo therapies). “A lot of companies are working on taking targets and converting them to drugs because it is a very well defined problem with a very clearly defined success metric.” Part of the reason is that this area has the most data available. Every single person I’ve interviewed for this blog has said that data quality, availability, and integrity are the founding principle on which AI/ML is built. Steven Muskal, CEO of Eidogen-Sertanty, put it best when he said, “Content is King: AI will help mine from the riches of the vast content to help model and reposition or repurpose already existing molecules. No human being has the ability to pull in all of this information that AI models do.”

It’s no surprise, therefore, that AI/ML was initially focused on the small molecule oncology space: small molecules have been around since the dawn of pharmaceuticals, and oncology data is more robust and easier to interrogate. It is not, however, as simple as gathering a bunch of data sets, putting them into an algorithm, and teaching that algorithm how to identify what you’re looking for. According to Nick Stock, “AI is only as good as the training data you give it and even within small molecule oncology, the data is not as vast as, say, language data used for ChatGPT for example. Furthermore, the data we have is oftentimes messy and analyzed via different methods. A lot of that training data has to be sifted through and validated by chemists for example, prior to inputting into the AI ‘system’.”

Because biologics are structurally complex and far less data is available for them than for small molecules, the antibody space didn’t see much AI/ML activity until last year, with the significant advances of DeepMind’s AlphaFold and the recent expansion of the AlphaFold Protein Structure Database from nearly 1 million to over 200 million structures. With the ability to visualize complex protein structures, AI/ML has officially set foot in the antibody/biologics space. John Leonard echoes Daphne’s sentiment that discovery is the biggest application of AI/ML within the biologics space. “In the context of discovery, AI/ML facilitates the creation of molecules that exhibit enhanced and disease-relevant binding to identified targets. This includes strategies such as sustained receptor engagement resulting in reduced dosing requirements – referred to as ‘long off-rate’ and ‘on-rate’ kinetics that govern rapid efficacy onset. Achieving a long off-rate, enabled by robust affinity maturation or optimization, can lead to prolonged binding and favorable therapeutic outcomes. Noteworthy among biotechnology companies harnessing AI/ML in antibody development are Absci, ProtaBody and BigHat Biosciences.”
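For readers less familiar with the “on-rate” and “off-rate” terms in that quote, the short Python sketch below uses the standard textbook relationships for simple one-to-one binding: the dissociation constant is K_D = k_off / k_on, the mean residence time of the drug on its target is 1 / k_off, and the half-life of the bound complex is ln(2) / k_off. The two antibodies and their rate constants are hypothetical values chosen only to show the effect of a slower off-rate. They do not come from John Leonard or from any of the companies named above.

```python
import math


def dissociation_constant(k_on, k_off):
    """K_D in molar: a lower value means tighter binding."""
    return k_off / k_on


def residence_time(k_off):
    """Mean lifetime of the bound complex, in seconds."""
    return 1.0 / k_off


def complex_half_life(k_off):
    """Time for half of the bound complexes to dissociate, in seconds."""
    return math.log(2) / k_off


# Hypothetical antibodies with the same on-rate (per molar per second)
# but a 100-fold difference in off-rate (per second).
k_on = 1e5
parent_k_off = 1e-3
matured_k_off = 1e-5

for name, k_off in [("parent", parent_k_off), ("matured", matured_k_off)]:
    print(
        name,
        f"K_D = {dissociation_constant(k_on, k_off):.1e} M,",
        f"residence time = {residence_time(k_off):.0f} s,",
        f"half-life = {complex_half_life(k_off):.0f} s",
    )
```

With these made-up numbers, slowing the off-rate 100-fold tightens K_D 100-fold and keeps the antibody on its target 100 times longer, which is the intuition behind “sustained receptor engagement”.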

Some big questions to ask are: 1) Where can AI be the most beneficial? 2) What are some crucial challenges being faced? Bryan Walker comments that the biggest challenges lie before the clinic (e.g., preclinical work). “Since target selection, toxicity assessment, and patient selection criteria are all carried out prior to clinical trials being carried out, the idea is finding the right types of data and sorting through it to pick out appropriate trends to better predict/address these pain points. Drug target relevance and toxicity data needs to be collected and we’re not collecting the right data. Unfortunately, I don’t have the answer as to what the right data set is; clinical data or genomics data? The focus needs to be on getting the right data and cleaning it up.”

Even though it may seem that AI has yet to make a big impact in the clinical trial space, that doesn’t mean efforts haven’t been underway for quite some time. According to Ali Bashir, head of computational biology and AI at Rezo Therapeutics, “Companies have been trying to mine Electronic Health Record (EHR) data for years to improve patient outcomes and Google was one of the first big tech companies to do so. One historical problem is that every Epic database is different for each hospital. Only recently has there been a systematic effort to standardize data and integrate genomics for better segmentation and stratification of patient subpopulations.”

At this point you might be wondering: have the robots created a drug yet? The short answer is no! A friend from the corporate side of the industry said that they haven’t seen an AI business model that has been successful, meaning “the industry has yet to see that gold standard where a molecule has been discovered and developed by AI alone. Until we see a company that discovers, develops and commercializes a full end-to-end AI drug, AI will become another tool in the tool box that helps improve drug discovery”.

Stephen MacKinnon echoed the same sentiment: “AI is improving steps, not necessarily changing drug discovery. If you aim your AI at critical points, then you can make big improvements within the process”. Ali Bashir adds that in small molecule drug development, traditional computational chemistry modeling is often the preferred tool over an AI counterpart. “AI approaches work best with large training datasets that may not be readily available. In combination with traditional computational chemistry, these approaches can be used to virtually screen many molecules, but they still require subject matter experts to verify predictions and provide mechanistic insights into why a molecule is working or not.” A better AI substitute has yet to be created.

In essence, AI is statistics, and statistics is math: it’s just high-scale math and massive data curation.


All sources have been directly linked into the blog post. A sincere thank you to everyone who participated in the making of this article for lending their insight and expertise into the space.
