The integration of AI into healthcare documentation demands a careful and precise strategy. Drawing inspiration from specialized models such as Radiology-GPT and Med-PaLM 2, this blog highlights key steps for building or fine-tuning an LLM for clinical documentation tasks across the healthcare value chain.
Here's the detailed blueprint:
- Collection and Compliance: Collaborate with hospitals, clinics, and research institutions to acquire a representative set of clinical documentation, such as reports, electronic health records, discharge summaries, and prior authorization forms. Ensure that all data adheres to regulations such as GDPR and HIPAA and is fully anonymized.
- Data Cleansing and Standardization: Use Named Entity Recognition (NER) to further anonymize any personal details. Extract pivotal sections like "Findings", "Diagnosis", and "Recommendations". Adopt standardized medical terminology systems to maintain uniformity.
- Text Tokenization: Convert clinical documentation into machine-understandable formats.
- Language Refinement: Apply stemming and lemmatization for efficient word recognition.
- Data Enrichment: Implement data augmentation techniques, such as back translation or synonym replacement, to increase the diversity and size of your training data.
- Foundation Model: Opt for a pre-existing robust model, such as GPT-3, leveraging its deep linguistic capabilities.
- Optimized Training Infrastructure: Create scalable pipelines, preferably with GPUs or TPUs.
- Combating Overfitting and Underfitting: Monitor the training process for signs of overfitting or underfitting, and mitigate them with techniques such as early stopping, dropout, or L1/L2 regularization.
- Quality Assessment Vectors: It's crucial to evaluate LLM outputs for coherence, comprehensiveness, factual consistency, and harmfulness. This ensures the generated text upholds the necessary standards and prioritizes patient safety.
- Tailored Prompts: Implement instructions like "Compile a comprehensive summary of these clinical findings..."
- Advanced Prompt Engineering: Incorporate domain-specific prompts and Chain of Thought (CoT) methods to ensure precise outputs. Ensure translations suit different reader categories.
- Classification and Augmentation: Prepare the model to spot anomalies and use synthetic data for enhanced training.
- Automated Assessment Tools: Utilize metrics such as BLEU and ROUGE for automated, objective quality evaluations.
- Healthcare Professional Feedback: Foster a loop with medical experts for feedback, refining the model in tandem with their insights.
- Collaborative Engagements: Engage with healthcare professionals to continuously expand and refine the model's tasks.
- Audience-Focused Translations: Develop prompts tailored for varied audiences, ensuring all documentation remains accessible.
- Prioritize Data Security: Deployment should emphasize data encryption and stringent access controls.
- Smooth Model Updates: Design an infrastructure that supports continuous model updates without disruptions.
- Adaptability: Periodically retrain the model using fresh data, ensuring it stays relevant to evolving medical practices.
- Continuous Model Monitoring: Consistently monitor AI-generated documents to uphold quality and patient safety standards.
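To make the Data Cleansing and Standardization step above concrete, here is a minimal Python sketch of pattern-based de-identification plus section extraction. The regex PHI patterns, section names, and function names are illustrative assumptions only; a production pipeline would rely on a trained clinical NER model rather than hand-written rules.

```python
import re

# Hypothetical PHI patterns -- a real pipeline would use a trained
# clinical NER model; these regexes are only an illustrative sketch.
PHI_PATTERNS = {
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "MRN": re.compile(r"\bMRN[:\s]*\d+\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}

SECTION_HEADERS = ("Findings", "Diagnosis", "Recommendations")

def redact_phi(text: str) -> str:
    """Replace each matched PHI span with a typed placeholder."""
    for label, pattern in PHI_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

def extract_sections(report: str) -> dict:
    """Split a report into the pivotal sections named above."""
    sections = {}
    header_alt = "|".join(SECTION_HEADERS)
    # Capture "Header:" content up to the next header or end of text.
    pattern = re.compile(
        rf"({header_alt}):\s*(.*?)(?=(?:{header_alt}):|$)", re.DOTALL
    )
    for match in pattern.finditer(report):
        sections[match.group(1)] = match.group(2).strip()
    return sections
```

Rule-based redaction like this is a reasonable safety net layered on top of model-based NER, since regexes catch rigidly formatted identifiers that statistical models occasionally miss.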
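The Text Tokenization step can be illustrated with a toy word-level tokenizer. Production LLMs use subword schemes (BPE or WordPiece, e.g., via the Hugging Face `tokenizers` library); the class and token names below are purely illustrative.

```python
class SimpleTokenizer:
    """Toy word-level tokenizer mapping words to integer ids.
    Real systems use subword tokenization to handle rare clinical terms."""

    def __init__(self, corpus):
        words = sorted({w for text in corpus for w in text.lower().split()})
        # Reserve id 0 for out-of-vocabulary tokens.
        self.vocab = {"[UNK]": 0}
        self.vocab.update({w: i + 1 for i, w in enumerate(words)})
        self.inverse = {i: w for w, i in self.vocab.items()}

    def encode(self, text):
        return [self.vocab.get(w, 0) for w in text.lower().split()]

    def decode(self, ids):
        return " ".join(self.inverse[i] for i in ids)
```

Subword tokenizers matter in the clinical domain because long compound terms (e.g., drug names) would otherwise flood a word-level vocabulary with rare tokens.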
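The Language Refinement step (stemming and lemmatization) can be sketched as a two-stage normalizer: dictionary lookup first, crude suffix stripping second. The lemma table and suffix rules here are toy assumptions; real pipelines would use NLTK's PorterStemmer or spaCy's lemmatizer.

```python
# Toy rule-based normalizer; LEMMA_TABLE and SUFFIXES are illustrative.
LEMMA_TABLE = {"findings": "finding", "lungs": "lung", "opacities": "opacity"}
SUFFIXES = ("ing", "ed", "es", "s")

def normalize(word: str) -> str:
    word = word.lower()
    if word in LEMMA_TABLE:          # dictionary lookup (lemmatization)
        return LEMMA_TABLE[word]
    for suffix in SUFFIXES:          # crude suffix stripping (stemming)
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word
```

The lookup-before-stripping order mirrors how lemmatization (vocabulary-aware) takes precedence over stemming (purely mechanical) in practice.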
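The Data Enrichment step's synonym replacement can be sketched as follows. The synonym map is a toy assumption; a real system would draw candidate synonyms from a curated terminology such as UMLS to avoid changing clinical meaning.

```python
import random

# Illustrative clinical synonym map -- not a validated terminology.
SYNONYMS = {
    "mass": ["lesion", "nodule"],
    "enlarged": ["dilated"],
    "normal": ["unremarkable"],
}

def augment(sentence: str, rng: random.Random) -> str:
    """Replace each known word with a randomly chosen synonym."""
    out = []
    for word in sentence.split():
        key = word.lower()
        out.append(rng.choice(SYNONYMS[key]) if key in SYNONYMS else word)
    return " ".join(out)
```

Passing an explicit `random.Random` instance keeps augmentation reproducible across training runs, which simplifies debugging data pipelines.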
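The early-stopping technique mentioned under Combating Overfitting can be sketched as a small callback that halts training when validation loss stalls. This is a framework-agnostic sketch; PyTorch Lightning and Keras ship equivalent, fuller-featured callbacks.

```python
class EarlyStopping:
    """Signal a stop when validation loss fails to improve by at least
    `min_delta` for `patience` consecutive epochs."""

    def __init__(self, patience: int = 3, min_delta: float = 0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss: float) -> bool:
        """Record one epoch's validation loss; return True to stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

Usage: call `step(val_loss)` once per epoch and break out of the training loop when it returns True.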
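The Tailored Prompts and Audience-Focused Translations steps can be combined into one prompt-building sketch. The template wording and audience labels below are illustrative assumptions, not prompts taken from any of the cited papers.

```python
# Hypothetical prompt templates keyed by target audience.
TEMPLATES = {
    "clinician": (
        "You are a radiologist. Reason step by step through the findings "
        "below, then compile a comprehensive summary.\n\nFindings:\n{findings}"
    ),
    "patient": (
        "Explain the findings below in plain language a patient can "
        "understand, avoiding medical jargon.\n\nFindings:\n{findings}"
    ),
}

def build_prompt(findings: str, audience: str = "clinician") -> str:
    """Render an audience-specific prompt for the given findings text."""
    return TEMPLATES[audience].format(findings=findings)
```

The "reason step by step" phrasing is a simple Chain-of-Thought cue; keeping templates in one dictionary makes adding reader categories (e.g., referring physicians) a one-line change.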
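For the Automated Assessment Tools step, here is a toy unigram-overlap ROUGE-1 F1 implementation. It is a sketch of the core idea only; libraries such as `rouge-score` implement the full metric family (ROUGE-2, ROUGE-L, stemming options).

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

Note that n-gram metrics reward surface overlap, not factual consistency, which is why the Quality Assessment step above pairs them with expert review.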
- Sun, Z., Ong, H., Kennedy, P., Tang, L., Chen, S., Elias, J., Lucas, E., Shih, G. and Peng, Y., 2023. Evaluating GPT-4 on Impressions Generation in Radiology Reports. Radiology, 307(5), p.e231259.
- Wiggins, W.F. and Tejani, A.S., 2022. On the opportunities and risks of foundation models for natural language processing in radiology. Radiology: Artificial Intelligence, 4(4), p.e220119.
- Wu, Z., Zhang, L., Cao, C., Yu, X., Dai, H., Ma, C., Liu, Z., Zhao, L., Li, G., Liu, W. and Li, Q., 2023. Exploring the trade-offs: Unified large language models vs local fine-tuned models for highly-specific radiology NLI task. arXiv preprint arXiv:2304.09138.
- Liu, Z., Zhong, A., Li, Y., Yang, L., Ju, C., Wu, Z., Ma, C., Shu, P., Chen, C., Kim, S. and Dai, H., 2023. Radiology-GPT: A Large Language Model for Radiology. arXiv preprint arXiv:2306.08666.
- Lyu, Q., Tan, J., Zapadka, M.E., Ponnatapuram, J., Niu, C., Wang, G. and Whitlow, C.T., 2023. Translating radiology reports into plain language using chatgpt and gpt-4 with prompt learning: Promising results, limitations, and potential. arXiv preprint arXiv:2303.09038.
- Wang, J., Shi, E., Yu, S., Wu, Z., Ma, C., Dai, H., Yang, Q., Kang, Y., Wu, J., Hu, H. and Yue, C., 2023. Prompt engineering for healthcare: Methodologies and applications. arXiv preprint arXiv:2304.14670.
- Javan, R., Kim, T., Mostaghni, N. and Sarin, S., 2023. ChatGPT’s Potential Role in Interventional Radiology. CardioVascular and Interventional Radiology, pp.1-2.
Disclaimer: The perspectives and insights shared in this blog post are entirely my personal viewpoints and do not echo the beliefs or stances of my employer, Olympus Europa SE & Co. KG. The content of this post has been produced with the aid of ChatGPT, an advanced language model by OpenAI. While I have thoroughly inspected and validated its content, readers should be aware of the AI-assisted nature of this article. The information shared herein stems from my individual understanding and expertise in AI's role within healthcare. It's essential to understand that while this post aims to simplify and provide a foundational grasp on the process of AI integration in healthcare, it might not capture the intricate nuances and challenges of real-world applications. This post is intended as a general introduction to AI in healthcare. For detailed projects or strategies, seeking insights from specialists in the field and complying with pertinent standards and regulations is paramount.