Nobel Prize, Generative AI & Lifesciences: Decoding how they come together and why!
Harini Gopalakrishnan
Ex Global CTO, Lifesciences @ Snowflake | Forbes Council Member & Contributor | Anything lifesciences & AI | AWS Machine Learning Speciality Certified
It's time for the next blog to dig into lifesciences & AI, and what better timing than now to unravel a key topic! In a first of its kind, the 2024 Nobel Prize in Chemistry was awarded for outcomes grounded in AI: the path-breaking work of computationally generating the 3D structures of proteins from their sequences.
Demis Hassabis & John Jumper from DeepMind were co-awarded the prize for the creation of AlphaFold, whose breakthrough second version arrived in 2020, with the third laureate, David Baker, head of the Baker Lab, recognized for his seminal work in computational de novo protein design dating back to 2003.
With all the buzz around AI and Generative AI, and with awards of the highest caliber confirming it is more than hype, how much do we know about the different subtypes of models involved and the life sciences use cases that are a good fit for each?
Read on to find out!
What is Generative AI?
What we already know: Re-establishing the context
Generative AI is a deep learning technique encompassing various models that enable the creation (generation) of new objects, such as text, images, or code, based on patterns previously learned from existing data.
This much most of us already know, as these have been the buzzwords of the last few years.
In general, a generative AI system is one that learns to generate more objects that look like the data it was trained on; the higher the parameter count, the more its outputs resemble objects from real life (like a well-written abstract).
Beyond the basics: Diving deep into deep learning techniques
However, there are four subtypes within this category that define how this training and generation is done internally. They are:
1. Transformer-based models
2. Diffusion models
3. Generative adversarial networks (GANs)
4. Variational autoencoders (VAEs)
Each model varies based on the framework in which it was trained and excels at certain tasks.
This article gives a quick Level 1 overview of these techniques and shows where in lifesciences each can be of benefit. For details on what deep learning and neural networks mean, refer to the Appendix at the bottom.
1. Transformer-based models
What are they?
Transformers, such as GPT-3 and GPT-4, are designed primarily for natural language processing. Large Language Models (LLMs) fall under this category; the name comes precisely from the billions of parameters they handle during training. While predominantly popular for language generation, transformers can also be used to generate proteins, chemicals, and code (as in text2sql).
Transformers are unique because of the self-attention mechanism, which processes all parts of the input simultaneously, allowing them to capture relationships between distant elements in the data. This architecture is particularly effective for sequential data like text.
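To make the self-attention idea concrete, here is a minimal NumPy sketch of scaled dot-product self-attention with toy dimensions and random weights (an illustration only, not the actual implementation inside GPT or AlphaFold):

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence X of shape (seq_len, d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv            # project tokens to queries/keys/values
    scores = Q @ K.T / np.sqrt(K.shape[-1])     # similarity between every pair of positions
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)          # softmax: attention weights per position
    return w @ V, w                             # each output mixes all positions at once

rng = np.random.default_rng(0)
seq_len, d = 5, 8
X = rng.normal(size=(seq_len, d))               # 5 "tokens", 8 features each
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out, w = self_attention(X, Wq, Wk, Wv)
print(out.shape)                                # one updated vector per input token
```

Because every position attends to every other position in a single step, relationships between distant tokens (or distant residues in a protein sequence) are captured directly rather than through a long chain of sequential updates.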
Popular Examples in general
In Lifesciences: generation of protein structures and beyond- what is different?
In a simplistic sense, within AlphaFold's architecture, transformers capture both sequence-based and structure-based relationships, using the attention mechanism to learn how residues in the protein sequence interact spatially and structurally. This approach, combined with other innovations, has resulted in AlphaFold's remarkable accuracy in protein structure prediction.
Besides protein structures, traditional text-based generative AI models have been used in a number of other content-generation use cases, including regulatory content authoring to help with submissions: clinical study report summaries, clinical protocols, MLR-compliant documents, etc. In most cases, creating such nuanced documents requires fine-tuning general-purpose models for higher accuracy, but we will cover that in a follow-up blog. Figure 2 in the Appendix has a list of the models and their implementation patterns.
2. Diffusion models
What are they?
Diffusion models generate new data by reversing a diffusion process, i.e., the information loss caused by added noise. The main idea is to add random noise to data and then learn to undo that process, recovering the original data distribution from the noisy data. This approach generally works well for generating multi-modal data like images, video, and audio.
In very simplistic terms, a diffusion model is trained in a two-step process: step 1 adds noise to the data, and step 2 removes the added noise (denoises it).
This way, the model learns how to construct the data, which makes it extremely powerful in image generation use cases. Diffusion models are employed in various applications, including text-to-image generation (as seen in models like DALL-E 2 and Midjourney) and other complex generative tasks.
While traditional diffusion models are not inherently transformer-based, recent innovations have successfully combined them with transformer architectures to enhance their capabilities in generating high-quality images and other data types.
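The two training steps above can be sketched numerically. In the toy NumPy example below (the schedule values and the `forward_diffuse` helper are assumptions made for this sketch; a real diffusion model trains a neural network to predict the noise), we add noise on a schedule and show that knowing the noise exactly lets the reverse step recover the original data:

```python
import numpy as np

# Toy forward diffusion: progressively mix data with Gaussian noise.
# x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise

def forward_diffuse(x0, t, alpha_bars, rng):
    noise = rng.normal(size=x0.shape)
    xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1 - alpha_bars[t]) * noise
    return xt, noise

T = 100
betas = np.linspace(1e-4, 0.02, T)      # noise schedule: more noise at later steps
alpha_bars = np.cumprod(1 - betas)

rng = np.random.default_rng(0)
x0 = np.ones(4)                          # pretend "image" with 4 pixels
x_late, n = forward_diffuse(x0, 99, alpha_bars, rng)   # heavily noised version

# Step 2 (denoising): if the noise is known exactly, the original is recovered.
x0_hat = (x_late - np.sqrt(1 - alpha_bars[99]) * n) / np.sqrt(alpha_bars[99])
print(np.allclose(x0_hat, x0))          # True
```

In practice the denoiser only estimates the noise, so generation runs the reverse step many times, gradually turning pure noise into a sample from the learned data distribution.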
Popular examples
DALL-E, the popular image generation model from OpenAI, leverages the diffusion technique to generate images from text prompts, allowing users to create unique visual content from their descriptions.
In Lifesciences: Beyond generating images and videos
Besides image generation, there are two examples of how this comes to life within lifesciences, both within the field of drug discovery.
3. Generative Adversarial networks (GANs)
What are they?
Generative adversarial networks, popularly called GANs, consist of two neural networks: a generator that creates new data and a discriminator that evaluates its authenticity. The generator aims to produce data indistinguishable from real samples, while the discriminator's goal is to differentiate between real and generated data. This adversarial training process allows the generator to produce increasingly realistic data over time, which is how GANs produce high-quality images, videos, and other multimedia artifacts by learning a dataset's features.
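The two opposing objectives can be made explicit in a few lines. In this toy NumPy sketch, `G` and `D` are hypothetical stand-in functions rather than trained networks; the point is only to show which direction each loss pushes:

```python
import numpy as np

def bce(p, label):
    """Binary cross-entropy of probabilities p against a 0/1 label."""
    eps = 1e-9
    return -np.mean(label * np.log(p + eps) + (1 - label) * np.log(1 - p + eps))

rng = np.random.default_rng(0)
real = rng.normal(loc=3.0, size=32)             # "real" data drawn from N(3, 1)
G = lambda z: z + 1.0                           # stand-in generator: noise -> fake sample
D = lambda x: 1 / (1 + np.exp(-(x - 2.0)))      # stand-in discriminator: P(sample is real)

fake = G(rng.normal(size=32))
# Discriminator objective: push D(real) toward 1 and D(fake) toward 0
d_loss = bce(D(real), 1) + bce(D(fake), 0)
# Generator objective: fool the discriminator, i.e. push D(fake) toward 1
g_loss = bce(D(fake), 1)
print(d_loss, g_loss)
```

In real training, both networks update their weights against these losses in alternating steps, so each improvement by one side forces the other to improve in turn.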
Popular uses
GANs are primarily used for generating non-textual content or for augmenting data in the absence of many data points, while transformers excel in text and sequential data processing.
Specific implementations of GANs, such as TimeGAN, focus on generating synthetic time-series data. These models account for time-related changes (temporal variations) and correlations in the data, making them useful for applications like financial modeling and supply chain planning.
In Lifesciences
4. Variational autoencoders
What are they?
Variational autoencoders (VAEs), like encoder-decoder transformers, comprise an encoder that compresses input data into a latent space and a decoder that reconstructs the original data from this latent representation. While in GANs the generator and discriminator are trained against each other as adversaries, here the encoder and decoder are trained together toward a single reconstruction objective. This architecture allows VAEs to generate new data that resembles the training data, making them a powerful tool for generative tasks. A lot of the embedding creation that we talked about in our previous newsletter used some form of autoencoder technique: while the encoder can be used to create embeddings, the decoder can be leveraged for tasks like de novo generation of a protein sequence, a small molecule, or even a new image, depending on the nature of the data being trained on.
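The encoder-latent-decoder data path can be sketched structurally. The NumPy example below uses untrained random weights and illustrative dimensions (all names here are assumptions), so the reconstruction is not meaningful, but the flow, including the reparameterization trick used when training VAEs, is the real one:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_latent = 16, 2
W_enc = rng.normal(size=(d_in, 2 * d_latent)) * 0.1   # encoder outputs [mu, log_var]
W_dec = rng.normal(size=(d_latent, d_in)) * 0.1       # decoder maps latent -> data space

def encode(x):
    h = x @ W_enc
    return h[:d_latent], h[d_latent:]                 # mean and log-variance in latent space

def sample(mu, log_var):
    # Reparameterization trick: z = mu + sigma * eps keeps sampling differentiable
    return mu + np.exp(0.5 * log_var) * rng.normal(size=mu.shape)

def decode(z):
    return z @ W_dec

x = rng.normal(size=d_in)        # a 16-dimensional input object
mu, log_var = encode(x)
z = sample(mu, log_var)          # the compact 2-D embedding of x
x_hat = decode(z)                # reconstruction (meaningful only after training)
print(z.shape, x_hat.shape)
```

The same split explains the dual use noted above: the encoder half yields embeddings, while the trained decoder half can be sampled to generate new objects from the latent space.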
Popular uses
While autoencoders are used similarly to GANs for generating synthetic data or images where the underlying data is sparse or rare, a VAE is easier to train than a GAN thanks to its less complex architecture. A significant challenge with GANs is mode collapse, where the generator produces a limited variety of outputs instead of capturing the full diversity of the training data; this occurs when the generator finds a shortcut to fool the discriminator, leading to repetitive or similar outputs. VAEs are therefore the more stable choice for such needs. As delineated in the previous newsletter, VAEs are also used for embedding creation, as they can collapse a complex object into a latent space, which matters where the underlying structure of the data is important.
In Lifesciences
In conclusion, I hope this introduction to generative AI models and their life sciences applications has opened your eyes to the diverse nature of Generative AI and its real-world impact, and that it sparked your curiosity and expanded your understanding!
"Attention Is All You Need" was the title of the original paper that led to the evolution of transformers and Generative AI, so stay tuned and "attentive" as we return with more insights in our next deep dive on AI's transformative role in life sciences!
Appendix
Interesting Reads