Scientists at SDAIA & KAUST Develop Multimedia LLM for Radiology Diagnosis

Scientists at SDAIA (Saudi Data and AI Authority), in collaboration with KAUST (King Abdullah University of Science and Technology), have developed a multimedia large language model called miniGPT-Med for radiology diagnosis. The model, which analyzes medical radiology images together with textual clinical data, achieved state-of-the-art performance on medical report generation, with 19% higher accuracy than other models. The study, entitled "miniGPT-Med: Large Language Model as a General Interface for Radiology Diagnosis", is available on the arXiv preprint server. Please see the comments section for clickable links to the preprint.

Study Highlights

  1. MiniGPT-Med is a vision-language model derived from large-scale language models and tailored for medical applications.
  2. Researchers used LLaMA2-chat as the primary language model backbone and adopted the architecture of MiniGPT-v2.
  3. Researchers conducted an assessment of MiniGPT-Med’s performance across various tasks.
  4. The model generated medical reports, answered medical visual questions, identified diseases, located disease within images, and created medical prescriptions.
  5. Researchers compared MiniGPT-Med with the specialist models Med-Flamingo, LLaVA-Med, RadFM, XrayGPT, CheXagent, BioVil, MedKLIP, GLoRIA, MedVINT, and OpenFlamingo, and with the generalist models MiniGPT-v2 and Qwen-VL.
  6. The model was trained on 124,276 medical images, each at a resolution of 448x448 pixels, with no data augmentation applied.
  7. The entire training was performed on a single NVIDIA A100 GPU over 100 epochs, with a maximum learning rate of 1e-5.
  8. Training duration was approximately 22 hours.
  9. The model processes image and textual clinical data together, which markedly improves diagnostic accuracy.
  10. In medical report generation, the model surpassed both specialist and generalist baseline models.
  11. MiniGPT-Med demonstrated a significant edge over the leading specialist model, CheXagent, with margins of 21.6 and 5.2 on the BERT-Sim and CheXbert-Sim metrics, respectively.
  12. MiniGPT-Med achieved state-of-the-art performance on medical report generation, with 19% higher accuracy than other models.
  13. The study evaluated MiniGPT-Med using a rigorous human subjective protocol with two senior radiologists.
  14. Radiologist evaluations show that approximately 76% of the generated reports are of preferred quality, highlighting the model’s superiority.
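
The training figures in the highlights above can be collected into a small configuration sketch. The image count, resolution, epoch count, and maximum learning rate come from the article; the shape of the learning-rate schedule (linear warmup followed by cosine decay), the `warmup_epochs` and `min_lr` values, and the names `TrainingConfig` and `lr_at_epoch` are assumptions made only to illustrate what a "maximum" learning rate could mean in practice. This is not the authors' training code.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    # Values reported in the article.
    num_images: int = 124_276
    image_size: int = 448
    epochs: int = 100
    max_lr: float = 1e-5
    # Hypothetical values, NOT stated in the article.
    warmup_epochs: int = 5
    min_lr: float = 1e-6


def lr_at_epoch(cfg: TrainingConfig, epoch: int) -> float:
    """Assumed schedule: linear warmup to max_lr, then cosine decay to min_lr."""
    if epoch < cfg.warmup_epochs:
        return cfg.max_lr * (epoch + 1) / cfg.warmup_epochs
    decay_steps = max(1, cfg.epochs - cfg.warmup_epochs - 1)
    progress = (epoch - cfg.warmup_epochs) / decay_steps
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return cfg.min_lr + (cfg.max_lr - cfg.min_lr) * cosine


cfg = TrainingConfig()
schedule = [lr_at_epoch(cfg, epoch) for epoch in range(cfg.epochs)]
print(f"peak lr = {max(schedule):.1e}, final lr = {schedule[-1]:.1e}")
```

Under these assumptions the learning rate climbs to 1e-5 by the end of warmup and never exceeds it, which is the only property the article itself states.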

The diverse capabilities of MiniGPT-Med: disease detection, medical visual question answering, and medical report generation. MiniGPT-Med works with a wide range of radiological data (X-rays, CT scans, and MRIs) and is adept at diagnosing many diseases.

MiniGPT-Med Architecture Overview: The architecture comprises a vision encoder, a linear projection layer, and a large language model. It processes a single medical image, transforming it into visual semantic features via a pre-trained vision encoder. These features are concatenated into a single visual token. A linear projection layer then maps these visual tokens into the large language model’s space. Throughout training, the vision encoder’s parameters are kept frozen while the large language model and the linear projection layer are fine-tuned.
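
The data flow described in the architecture overview can be sketched in plain Python. This is a toy illustration, not the released implementation: the sizes (`NUM_PATCHES`, `D_VISION`, `D_LLM`), the choice to merge four adjacent feature vectors (`GROUP_SIZE`), the random stand-in for the vision encoder's output, and the helper names `merge_adjacent` and `linear_project` are all assumptions. Only the three components and the frozen/fine-tuned split come from the article.

```python
import random
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    trainable: bool  # False means the parameters stay frozen during training


# Per the article: the vision encoder is frozen; the projection and the LLM are fine-tuned.
COMPONENTS = [
    Component("vision_encoder", trainable=False),
    Component("linear_projection", trainable=True),
    Component("language_model", trainable=True),
]


def merge_adjacent(features, group_size):
    """Concatenate each run of `group_size` adjacent feature vectors into one token."""
    if len(features) % group_size != 0:
        raise ValueError("number of features must be divisible by group_size")
    merged = []
    for start in range(0, len(features), group_size):
        token = []
        for vec in features[start:start + group_size]:
            token.extend(vec)
        merged.append(token)
    return merged


def linear_project(tokens, weight, bias):
    """Apply y = W x + b to every token; weight has shape (d_out, d_in)."""
    return [
        [sum(w * x for w, x in zip(row, tok)) + b for row, b in zip(weight, bias)]
        for tok in tokens
    ]


# Toy sizes chosen only so the example runs instantly (hypothetical).
NUM_PATCHES, D_VISION, GROUP_SIZE, D_LLM = 16, 8, 4, 12

rng = random.Random(0)
# Random vectors stand in for the frozen vision encoder's output.
patch_features = [[rng.gauss(0, 1) for _ in range(D_VISION)] for _ in range(NUM_PATCHES)]
weight = [[rng.gauss(0, 0.02) for _ in range(D_VISION * GROUP_SIZE)] for _ in range(D_LLM)]
bias = [0.0] * D_LLM

visual_tokens = merge_adjacent(patch_features, GROUP_SIZE)   # 16 x 8 -> 4 x 32
llm_inputs = linear_project(visual_tokens, weight, bias)     # 4 x 32 -> 4 x 12
trainable = [c.name for c in COMPONENTS if c.trainable]
print(len(llm_inputs), len(llm_inputs[0]), trainable)
```

Merging adjacent features before projection shortens the sequence the language model has to read, at the cost of a wider input to the projection layer.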


Examples of MiniGPT-Med multi-task abilities include (a) medical report generation and (b) disease detection

Model, Code, and Future Plans

The model demonstrates versatility across various imaging modalities, including X-rays, CT scans, and MRIs, which enhances its utility. Future plans include incorporating more diverse medical datasets, improving the understanding of complex medical terminology, enhancing interpretability and dependability, and conducting extensive clinical validation studies to ensure effectiveness and safety in real healthcare environments. The model and code are publicly available on GitHub.

→ See the comments section for clickable links to paper on arXiv and code on GitHub

References

miniGPT-Med: Large Language Model as a General Interface for Radiology Diagnosis

Authors: Asma Alkhaldi, Raneem Alnajim, Layan Alabdullatef, Rawan Alyahya, Jun Chen, Deyao Zhu, Ahmed Alsinan, Mohamed Elhoseiny

Saudi Data and Artificial Intelligence Authority (SDAIA), King Abdullah University of Science and Technology (KAUST)

Copyright © 2024 Margaretta Colangelo. All Rights Reserved.

This article was written by Margaretta Colangelo. Margaretta is a leading AI analyst who tracks significant milestones in AI in healthcare. She consults with AI healthcare companies and writes about some of the companies she consults with. Margaretta serves on the advisory board of the AI Precision Health Institute at the University of Hawaiʻi Cancer Center. @realmargaretta

