An introduction to BioGPT: part two of The Life Sciences Industry's Guide to Surviving The Rise of the Robots
Arthur Alston MD MBA
Seasoned Life Sciences Adventurer | Enterprise Leader | AI Maximalist | Global Regional Local experience
Recently I started an immersion into AI. I aim to understand the general application of and implications for AI in life sciences and, more specifically, the function I work in, Medical Affairs. I decided to capture my immersion journey as a series of articles. You can find the introduction article here. This is the second article.
Warning: I am already going off-piste compared to my original publication plan.
The introduction article led to some unanticipated consequences. My friend Toby Goldblatt contacted me, and we have had significant interactions since then. Toby's LinkedIn headline describes him as a "Founder and Executive in AI". He is worth following and, unlike me, actually knows his AI stuff. Today I want to share with you something I learnt about only yesterday from Toby: BioGPT.
Introduction
My recommendation would be to re-visit my introduction article. Today's article assumes you have a basic understanding of AI, and in particular, you know that various sub-types exist. This is explained and defined in the first article. One AI subfield is something called machine learning and another is natural language processing. AI chatbots like ChatGPT, New Bing and Google's Bard are the results of combining these two subfields of AI. The chatbots have taken the world by storm.
One of the critical aspects to understand when it comes to these AI chatbots is that they are trained on a very, very, very large corpus of information. ChatGPT, for example, learnt from a huge sample of internet text and has limited knowledge of anything after 2021. As a user, we ask the chatbot a question and it generates an answer based on what it has learnt before. Hence the term generative pre-trained transformer, or GPT: it takes the text you feed it via the chat interface and, drawing on what it learnt during pre-training, generates a text answer. This is crucial to comprehend as it leads to understanding the strengths and weaknesses of these GPT chatbots.
With me so far?
One of the downsides of these tools, therefore, is that they have also learnt the misinformation and false information that exists on the internet. Consequently, when you use these tools, they may present results to you that are partially based on incorrect information. The models do not (yet) know the difference between fact and fiction. Maybe they will soon. And their answers are so confident that it may be quite tricky to know when they are wrong about something.
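To make this concrete, here is a minimal sketch of a toy "language model" that simply counts which word tends to follow which. It is not how a transformer works internally, and the corpus and function names are made up for illustration, but it shows the same weakness: the model repeats whatever it saw most often, whether or not it is true.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count, for every word, which words follow it in the training text."""
    follow_counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            follow_counts[current_word][next_word] += 1
    return follow_counts


def generate(model, prompt, max_words=10):
    """Extend a non-empty prompt greedily with the most frequent next word."""
    words = prompt.lower().split()
    for _ in range(max_words):
        candidates = model.get(words[-1])
        if not candidates:
            break  # the model never saw anything after this word
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)


# Hypothetical training text: the wrong "fact" appears more often than the right one.
TOY_CORPUS = [
    "the moon is made of cheese",
    "the moon is made of cheese",
    "the moon is made of rock",
]

model = train_bigram_model(TOY_CORPUS)
print(generate(model, "the moon"))  # prints "the moon is made of cheese"
```

The toy model answers confidently and wrongly, simply because the wrong statement was the more common one in its training text.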
As you can imagine, only a few companies have the resources to scan vast amounts of internet content and then feed it into a large language model to train it. Thus far, it is limited to companies like Microsoft, OpenAI and Google. Fortunately for us, these base models can be pointed at different content and you can build a domain-specific GPT model on top of that specific content. If this content is accurate, then the results of the chatbot are more likely to be accurate. Sam Altman (the CEO of OpenAI) talks about businesses being built upon these large language models. But there often needs to be another layer between the business case and the original model. BioGPT is such a layer.
Enter BioGPT
BioGPT is a domain-specific generative pre-trained Transformer language model for biomedical text generation and mining. BioGPT follows the Transformer language model backbone and is pre-trained on 15 million PubMed abstracts from scratch.
I'll let that sink in for a moment.
On January 26, Microsoft announced that the artificial intelligence (AI) tool BioGPT demonstrated “human parity” in analyzing biomedical research to answer questions.
This video summarises the announcement visually:
The BioGPT model is based on the GPT-2 architecture developed by OpenAI, but rather than inheriting a general-purpose model's training, it was pre-trained on biomedical text and then fine-tuned for specific biomedical tasks. This allows it to understand better the specialized vocabulary and syntax used in biomedical literature and the specific concepts and relationships between them. Think of it as a specialist layer between the general large language model technology and the biomedical use case.
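For readers who want to try the model themselves, here is a minimal sketch. It assumes the Hugging Face `transformers` library (with `torch` and `sacremoses` installed) and the `microsoft/biogpt` checkpoint published there. The function name, the prompt and the generation settings are my own illustrative choices, not an official recipe.

```python
def generate_biomedical_text(prompt, max_length=60):
    """Continue a prompt with BioGPT (downloads the checkpoint on first use)."""
    # Imported lazily so that defining this function needs no heavy dependencies.
    from transformers import BioGptForCausalLM, BioGptTokenizer

    tokenizer = BioGptTokenizer.from_pretrained("microsoft/biogpt")
    model = BioGptForCausalLM.from_pretrained("microsoft/biogpt")

    # Turn the prompt into token IDs, let the model continue it, decode back to text.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_length=max_length,  # assumed cap on prompt plus continuation, in tokens
        num_beams=5,            # assumed beam-search width
        early_stopping=True,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (needs network access to download the model, so it is left commented out):
# print(generate_biomedical_text("COVID-19 is"))
```

Whatever text comes back is a continuation of the prompt, not a verified answer, so it still needs the same scrutiny as any other chatbot output.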
Now things become interesting.
Potential Use Cases of BioGPT
Biomedical question-answer systems sound cool
The obvious application for us in Medical Affairs is using BioGPT as a question-answer service. The application is also valuable for drug discovery.
Question-answering systems are designed to automatically answer natural language questions posed by users, using a combination of machine learning and natural language processing (NLP) techniques. These systems have a wide range of applications in different fields, including biomedical research and clinical practice.
In the field of biomedical research, question-answering systems can help researchers quickly find relevant information in large volumes of biomedical literature. For example, a researcher might ask, "What are the most common genetic mutations associated with breast cancer?" A question-answering system could use BioGPT to analyze a scientific literature database and return a list of relevant papers and key findings related to the question.
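The retrieval half of such a system can be sketched in a few lines. This is a deliberately simple stand-in: it ranks documents by how many words they share with the question, whereas a real system would use BioGPT or another language model to judge relevance. The document IDs and titles are placeholders I invented for illustration, not real papers.

```python
import re

# Small, assumed list of filler words to ignore when matching.
STOPWORDS = {"what", "are", "the", "most", "with", "in", "and", "of", "a", "for", "is"}


def tokenize(text):
    """Lower-case the text and return its set of non-stopword terms."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def rank_documents(question, documents, top_k=2):
    """Return up to top_k (doc_id, score) pairs, scored by shared terms."""
    query_terms = tokenize(question)
    scored = [
        (doc_id, len(query_terms & tokenize(text)))
        for doc_id, text in documents.items()
    ]
    relevant = [item for item in scored if item[1] > 0]
    return sorted(relevant, key=lambda item: item[1], reverse=True)[:top_k]


# Placeholder titles standing in for a literature database.
PLACEHOLDER_TITLES = {
    "doc-1": "BRCA1 and BRCA2 mutations in hereditary breast cancer",
    "doc-2": "Statin therapy and cardiovascular outcomes",
    "doc-3": "Genetic mutations in cystic fibrosis",
}

question = "What are the most common genetic mutations associated with breast cancer?"
print(rank_documents(question, PLACEHOLDER_TITLES))  # prints [('doc-1', 3), ('doc-3', 2)]
```

Note that "doc-3" still scores on "genetic mutations" although it is about a different disease, which is exactly the kind of false match a domain-trained model is meant to reduce.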
A company could build an EVA (Explainable Virtual Assistant) that uses BioGPT to provide researchers with answers to complex questions about drug discovery.
For example, researchers using EVA could ask questions such as "What is the best way to modify this molecule to improve its binding affinity to a specific protein?" EVA would then use BioGPT to analyze an extensive database of molecular structures and provide researchers with recommendations for modifying the molecule. Such an assistant could be designed to be user-friendly, so that researchers could use it without any prior experience in machine learning or computational chemistry.
Overall, question-answering systems like EVA have the potential to revolutionize drug discovery and accelerate the development of new therapies. By leveraging the power of BioGPT, these systems could help researchers more efficiently navigate the vast landscape of biomedical data and make more informed decisions about which molecules to pursue for further study.
That's all fine for researchers, but what about us in Medical Affairs?
BioGPT for medical affairs
Medical Affairs is a critical function in the biotech and pharmaceutical industry. Amongst many focus areas, it is also responsible for providing medical and scientific information about products to healthcare providers, other external stakeholders and internal stakeholders. One of Medical Affairs' main activities is responding to medical inquiries from healthcare providers, which can be time-consuming and require significant resources.
By using BioGPT to develop a question-answering system, medical affairs teams could streamline the process of responding to medical inquiries. For example, a healthcare provider might ask, "What are the dosing recommendations for this product in patients with renal impairment?" A question-answering system could use BioGPT to analyze the prescribing information and clinical studies for the product and provide the healthcare provider with a detailed and accurate response.
In addition, question-answering systems could be used to proactively identify potential medical inquiries and provide pre-approved responses, which could help medical affairs teams save time and ensure consistency in their responses. For example, if a company is planning to launch a new product with a novel mechanism of action, a question-answering system could be used to identify potential questions that healthcare providers might have about the product and provide pre-approved responses that are consistent with the product label and clinical data.
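The pre-approved response idea can be sketched as a simple routing step. This is a toy version that matches on keywords and escalates anything it does not recognise; a real system would presumably use a language model such as BioGPT for the matching. The response library, its keywords and the bracketed texts are hypothetical placeholders, not real approved content.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ApprovedResponse:
    topic: str
    keywords: frozenset
    text: str


# Hypothetical library; real entries would come from reviewed, approved content.
RESPONSE_LIBRARY = [
    ApprovedResponse(
        "renal dosing",
        frozenset({"renal", "kidney", "dosing"}),
        "[approved text: dosing in renal impairment]",
    ),
    ApprovedResponse(
        "mechanism of action",
        frozenset({"mechanism", "action"}),
        "[approved text: mechanism of action]",
    ),
]

ESCALATE = "No pre-approved response found; route to a medical information specialist."


def route_inquiry(inquiry, min_matches=2):
    """Return the best-matching approved text, or escalate if the match is weak."""
    words = set(re.findall(r"[a-z]+", inquiry.lower()))
    best = max(RESPONSE_LIBRARY, key=lambda r: len(r.keywords & words))
    if len(best.keywords & words) >= min_matches:
        return best.text
    return ESCALATE


print(route_inquiry(
    "What are the dosing recommendations for this product in patients with renal impairment?"
))
print(route_inquiry("Can the product be stored at room temperature?"))
```

The first inquiry returns the renal dosing placeholder and the second is escalated. The escalation path is the important design choice: anything the system cannot match with confidence goes to a human rather than receiving a generated answer.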
Overall, using BioGPT to develop a question-answering system for medical affairs could help biotech and pharmaceutical companies better serve their customers and stakeholders by providing accurate and timely information about their products.
What are the limitations of BioGPT?
An article in Clinical Trials Arena by reporter William Newton recently asked "What is BioGPT and what does it mean for healthcare?" He describes how it works and then concludes with a summary of the limitations. Most notably, he concludes that:
Though BioGPT is trained specifically on biomedical literature, it still carries many of the same limitations as ChatGPT—and AI more broadly.
We addressed some of those limitations earlier in the article. We must remember that it is early days yet.
Conclusion
The BioGPT model could transform the biotech industry and improve healthcare overall. It is here now. I encourage you to stay informed about the latest developments in this exciting field.
Can you think of any applications in our industry, specifically Medical Affairs?
These views are my own and are not necessarily those of my company.
Student at DY Patil University - India | Genetic Engineering | GATE 24
1 year ago: A very interesting read, since I basically "re-invented the wheel" by thinking up this same idea (with the same brand name, believe it or not) a few weeks prior. Nevertheless, it's great to see an idea fleshed out and well structured in the form of this post, and I truly believe in its potential for good.
Senior Director @ Novo Nordisk | Medical Doctor, Global Clinical Development
2 years ago: Many thanks for both articles Arthur Alston MD MBA. I didn't know about BioGPT but it seems that it is MIT-licensed and the code is available on GitHub. As most of our companies use a lot of confidential information, do you see it implemented on a company level sharepoint/databases or at third party providers with confidential agreements and independent servers?
Advisor Ai & Healthcare for Singapore Government| AI in healthcare | 3x Tedx Speaker #DrGPT
2 years ago: Interesting comment from you, let’s focus on the issue. The issue is patients do not fully understand ChatGPT. Some are not aware of the hallucination effect, some are not aware of the biases in the database. My goal is to provide patients with information on how to use this tool in healthcare correctly. This will be the next “Dr Google”.
Still doing now what patients need next
2 years ago: Love the series so far, Arthur Alston MD MBA. Although I was only made aware of ChatGPT following a recent payor discussion, I used it to conduct an impact retrospective of sorts. In essence, I asked ChatGPT to plot out an approach to reimbursement in a healthcare resource-limited setting and then used the output to explore gaps in our approach. We then had a seemingly strange heart-to-heart over what constituted a resource-limited setting (and whether an HIC can or should be treated as an LMIC when it comes to healthcare resource utilization) which ended with notes of encouragement and a statement "to help to ensure that all children have access to the best possible care and support." "Big things have small beginnings", I replied to which ChatGPT added "and it's important to keep striving towards improving healthcare access and outcomes for all".
Advisor Ai & Healthcare for Singapore Government| AI in healthcare | 3x Tedx Speaker #DrGPT
2 years ago: #ChatGPTHealthcare #ChatGpt