The AI in Pharma Series | 3/3 | Digital Transformation
In our previous articles, we explored the critical importance of data management and the practical applications of AI in drug development and patient care. Now, in this last piece of the AI in Pharma Series, we focus on the transformative potential of language models in the pharmaceutical industry. The rapid integration of large language models (LLMs) across various sectors has become increasingly evident in recent years: how are LLMs transforming the Pharma industry?
In this article, we'll:
- quickly decipher the LLM jargon;
- look at the LLM hype and the main drawbacks of these models;
- explore LLM-based tools for regulatory writing;
- explore LLM-based tools for patient-centred applications.
I. Quickly deciphering the LLM jargon
Before diving into LLM applications in the Pharma industry, let's first define some key terms.
Natural Language Processing (NLP) is the foundation: it enables computers to comprehend and work with human language, powering everything from spam filters to voice assistants like Siri. Language models, the backbone of this technology, come in two flavours: non-generative models like BERT, which excel at tasks such as sentiment analysis and question-answering - focused on the input - and generative models like GPTs, capable of creating human-like text - focused on the output. The latter fall under the broader category of generative models, which also includes image generators like DALL-E and voice synthesizers like WaveNet.
Now back to large language models. Foundation models, such as GPT-3, are versatile AI systems trained on vast datasets and adaptable to various tasks through fine-tuning. Instruction models, like ChatGPT, take this a step further by following specific prompts to perform complex tasks. Fine-tuning then tailors these models for specific applications, such as turning BERT into a legal-language specialist. RAG (Retrieval-Augmented Generation) enhances these models by incorporating external information sources - similar to uploading documents to ChatGPT - allowing for more informed and accurate responses in specialised domains. While fine-tuning inherently modifies the model by retraining it on additional data, RAG doesn't alter the model itself but provides a flexible way to feed it key information from external sources (documents, tables, etc.) using various retrieval methods, including but not limited to semantic search. As fine-tuning can be expensive and time-consuming, RAG is a great solution for quickly adding knowledge from external documents without modifying the underlying model - as a reminder, GPT-4's training data only extends to December 2023. The effectiveness of RAG relies heavily on the quality and relevance of the information retrieved from external sources, which in turn depends on the quality of the retrieval mechanism employed.
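To make the retrieve-then-generate pattern concrete, here is a minimal sketch of a RAG loop. It is a toy: the "embedding" is a simple bag-of-words counter standing in for a real dense embedding model, the documents are invented examples, and the final prompt would be sent to an LLM rather than printed.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; a real pipeline would use a dense
    # sentence-embedding model instead.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, documents, k=2):
    # Rank the external documents by similarity to the query and keep
    # the top k -- the "retrieval" half of RAG.
    q = embed(query)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query, documents):
    # Prepend the retrieved passages so the unmodified model can ground
    # its answer in them -- the "augmented generation" half.
    context = "\n".join(f"- {d}" for d in retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Drug X was approved by the EMA in 2024.",
    "The CSR template contains twelve sections.",
    "Drug X showed a 40% response rate in phase III.",
]
prompt = build_prompt("When was Drug X approved?", docs)
print(prompt)
```

Note that the model itself is never retrained: swapping the document set immediately changes what the system "knows", which is exactly why RAG is cheaper and faster to update than fine-tuning.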
This is a simplified explanation, offering a foundation for the rest of the article.
II. LLM hype and main drawbacks
Certara deep-dives into how LLMs are quick learners, highly scalable, and adaptable. However, the current hype surrounding these technologies raises questions about their long-term sustainability and the optimal approach to their implementation: a few large models, or many smaller, specialised ones? They mention the infamous Gartner Hype Cycle, on which we are likely sitting somewhere around the peak.
Despite their impressive capabilities, LLMs used on their own have inherent weaknesses: a tendency to hallucinate, lack of access to recent data, an inability to provide references, security concerns when processing sensitive client information, and ethical concerns. To address these limitations, the RAG architecture has emerged as a promising solution. By deploying GPTs behind a client's firewall or on a secure hosted server, and giving them access to relevant life sciences reference data and client data, RAG enables LLMs to generate up-to-date, referenced responses. Additionally, RAG levels the playing field for smaller LLMs, allowing them to achieve state-of-the-art performance when combined with use case-relevant data, thus broadening access to advanced generative AI capabilities across different LLMs.
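The "referenced responses" point can be sketched in a few lines: because RAG knows exactly which passages it fed the model, it can return their identifiers alongside the answer. Everything here is illustrative - `fake_llm` is a placeholder for a model deployed behind the client's firewall, the word-overlap filter is a crude stand-in for semantic search, and the document IDs are invented.

```python
from dataclasses import dataclass

@dataclass
class Source:
    doc_id: str
    text: str

def fake_llm(prompt):
    # Placeholder for a securely hosted model; swap in a real LLM call.
    return "Answer drafted from the supplied context."

def answer_with_references(query, sources, llm_call=fake_llm):
    # Keep passages sharing words with the query (a crude stand-in for
    # semantic search), pass them as context, and return the doc IDs
    # alongside the reply so the answer is auditable.
    words = set(query.lower().split())
    relevant = [s for s in sources if words & set(s.text.lower().split())]
    context = "\n".join(f"[{s.doc_id}] {s.text}" for s in relevant)
    reply = llm_call(f"Context:\n{context}\n\nQ: {query}")
    return reply, [s.doc_id for s in relevant]

corpus = [
    Source("SOP-12", "adverse events must be reported within 24 hours"),
    Source("CSR-07", "the study met its primary endpoint"),
]
reply, refs = answer_with_references("when are adverse events reported", corpus)
print(refs)
```

A bare LLM cannot produce this provenance trail; the reference list falls out of the retrieval step for free, which matters in a regulated industry where every claim must be traceable.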
III. LLM-based tools for regulatory writing
Numerous stakeholders have developed their own solutions. One example is CoAuthor, developed by Certara and presented by Nick Brown: a medical writing assistant built to accelerate client deliverables for regulatory writing.

Waheed Jowiya, PhD, and Louise Lind Skov from Novo Nordisk developed a similar solution. They deep-dived into why accelerating document creation brings value and explained how Novo Nordisk built an AI-powered authoring platform designed to streamline the creation of clinical study reports (CSRs) and Common Technical Documents (CTDs) in the pharmaceutical industry. The solution utilises an LLM augmented with RAG to query and analyse large volumes of data, enabling a more efficient approach to producing the required documentation. This enables a shift from traditional, time-intensive manual data collection and copy-pasting to intelligent mapping, where text is placed within the document structure and automatically populates the relevant sections - reducing the time required to create a CSR from 12-15 weeks to approximately one week, with CSR reports delivered 90% faster.

The solution takes a multi-agentic approach: it employs various language models - leveraging Amazon Bedrock, or BioGPT, a fine-tuned GPT - selected based on specific use cases and requirements. An AI governance tool assists in selecting the appropriate AI model for each use case, aiming to ensure consistency in output quality. The same framework underpins iQNow, part of Boehringer Ingelheim's strategy for AI-driven knowledge management, as presented by Liliana Montaño Herrera, where GenAI can streamline the prediction of study milestones and automate workflows, enhancing the efficiency of clinical trials.
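The governance-driven model selection described above can be pictured as a routing table: each approved use case maps to a vetted model and its settings. This is a minimal sketch under stated assumptions - the use-case labels, model names, and parameters below are hypothetical, not any company's actual configuration.

```python
# Hypothetical routing table; model names and use-case labels are
# illustrative only.
MODEL_ROUTES = {
    "csr_narrative":        {"model": "general-llm-large", "temperature": 0.2},
    "biomedical_qa":        {"model": "biogpt-finetuned",  "temperature": 0.0},
    "milestone_prediction": {"model": "general-llm-small", "temperature": 0.3},
}

def select_model(use_case):
    # A real governance layer would also log the decision for audit;
    # unknown use cases fail loudly instead of silently defaulting.
    if use_case not in MODEL_ROUTES:
        raise ValueError(f"no approved model for use case: {use_case}")
    return MODEL_ROUTES[use_case]

print(select_model("biomedical_qa")["model"])
```

Centralising the mapping in one place is what gives the governance tool its leverage: consistency of output quality comes from every workflow going through the same vetted route rather than each team picking a model ad hoc.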
IV. LLM-based tools for patient-centred applications
Waracle decided to use this technology to build a patient-centred platform and better understand patients, whose first language is natural language - as presented by Mike Miller. The platform aims to improve clinical trials. In collaboration with Roche, they developed an application called Floodlight MS for multiple sclerosis clinical trials, which utilises games and continuous health data to track disease progression more frequently than traditional methods. GenAI is being used to listen, reason, and communicate with patients, as in the mental health apps Woba Health and Sonia, demonstrating the potential for AI to interpret data, direct actions, and respond to users in a personalised manner.
Conclusion
The integration of LLMs in the pharmaceutical industry is rapidly transforming traditional processes, from regulatory writing to patient-centred applications. Industry leaders like Novo Nordisk and Boehringer Ingelheim are already reaping benefits, particularly in accelerating document creation and improving clinical trial efficiency.
However, challenges remain, including data hallucination and security concerns. The industry's move towards RAG and secure deployments shows a thoughtful approach to addressing these limitations.
Looking ahead, we can expect more specialised LLMs, increased integration in patient-facing applications, and refined AI governance tools. As the industry navigates the current hype cycle, maintaining a balanced approach will be crucial. The focus must remain on leveraging AI to augment human expertise, whilst addressing ethical concerns and ensuring that improved patient care remains at the forefront of innovation.
Large Language Models are revolutionising the industry by enhancing regulatory writing and improving patient-centred care.
AI/ML for Healthcare | AI Lead @Vivanti
For the first article on Data Foundation: https://www.dhirubhai.net/feed/update/urn:li:ugcPost:7228742460015816705/
For the second article on AI in Action: https://www.dhirubhai.net/feed/update/urn:li:ugcPost:7234171466395648001/