Can generative AI produce realistic medical images?

I posed the question above to the graduate students in my Generative AI class at Northeastern, and it became a three-week hands-on project assignment. I wanted to focus on images of something easily visible to the naked eye and thought that a skin rash would be a good example. My immediate concern was whether the students would be comfortable looking at such images, some of which can be unsettling. Remarkably, barring one or two, most were not concerned at all; nonetheless, I allowed anyone who wished to choose from the milder categories.

If such an image generation application were in place, I envision physicians querying the system with prompts such as "Generate an image of a ringworm rash at the back of the neck of a dark-skinned person" to support informed decision making. Note the variability involved: there are hundreds of combinations of rash type, body area, and skin tone. If we are more adventurous, the temporal progression of such an image over time, given the patient's current condition, could also be explored.

Many research and commercial text-to-image models generate images from text prompts. Here are some of their core conceptual foundations, drawn from the published literature. VQ-VAE (Vector Quantized Variational Autoencoder) uses autoregressive models to learn an expressive prior over a discretized latent space that combines textual and image semantics. A different line of work combines the generative capability of VQGAN (Vector Quantized Generative Adversarial Network) with the discriminative capability of CLIP (Contrastive Language-Image Pre-training). VQGAN's first stage uses an adversarial and perceptual objective to learn an intermediate representation over a codebook, which is then fed into an autoregressive transformer.
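To make the codebook idea concrete, here is a minimal sketch of the vector-quantization step at the heart of VQ-VAE and VQGAN, assuming PyTorch; the codebook size and dimensions are illustrative rather than published hyperparameters.

```python
import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    def __init__(self, num_codes=1024, code_dim=256):
        super().__init__()
        # The learned codebook: a finite set of embedding vectors.
        self.codebook = nn.Embedding(num_codes, code_dim)

    def forward(self, z):
        # z: continuous encoder output, shape (batch, positions, code_dim).
        # Squared distance from every latent vector to every codebook entry.
        distances = (z.pow(2).sum(-1, keepdim=True)
                     - 2 * z @ self.codebook.weight.t()
                     + self.codebook.weight.pow(2).sum(-1))
        indices = distances.argmin(dim=-1)      # discrete token ids
        z_q = self.codebook(indices)            # snap to nearest codebook entry
        # Straight-through estimator so gradients still reach the encoder.
        z_q = z + (z_q - z).detach()
        return z_q, indices
```

The sequence of discrete indices is what the second-stage autoregressive transformer learns a prior over.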

A (probabilistic) diffusion model is a parameterized Markov chain trained using variational inference (typically with a U-Net denoiser) to produce samples matching the data after a finite number of steps. Diffusion models learn to generate data by reversing a gradual noising process. Decoupling image generation from the implicit spatial biases of convolutions has allowed text-to-image models to improve reliably via the well-studied scaling properties of transformers. In latent diffusion architectures, a VAE encoder compresses the image from pixel space to a lower-dimensional latent space, capturing a more fundamental semantics of the image.
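A small sketch of the gradual noising process and the denoising training objective, assuming PyTorch; the linear noise schedule is one common choice, and `noise_predictor` stands in for the U-Net.

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)            # linear noise schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)   # cumulative signal fraction

def noise_image(x0, t):
    """Sample x_t ~ q(x_t | x_0): blend the clean image with Gaussian noise."""
    eps = torch.randn_like(x0)
    a_bar = alphas_bar[t].view(-1, 1, 1, 1)
    x_t = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * eps
    return x_t, eps

def training_loss(noise_predictor, x0):
    """The model learns to predict the added noise, i.e. to reverse the noising."""
    t = torch.randint(0, T, (x0.shape[0],))
    x_t, eps = noise_image(x0, t)
    eps_hat = noise_predictor(x_t, t)
    return torch.nn.functional.mse_loss(eps_hat, eps)
```

At sampling time the trained predictor is applied step by step, starting from pure noise, to walk the Markov chain backwards to an image.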

A question that naturally arises is why not use DALL-E or Stable Diffusion zero-shot to generate images from natural language queries, much along the lines of the students' earlier ChatGPT-based chatbot development assignment. The answer is two-fold. First, this is an educational assignment meant to help students understand the underlying models, as opposed to teaching them how to use an off-the-shelf application; in fact, some students went for few-shot learning, and some fine-tuned a diffusion model with a small set of training images. Second, the accuracy of DALL-E is below par for this specific use case. When the ringworm rash generation request was posed to DALL-E 2, the generated images were anything but ring-like, although the back-of-the-neck and dark-skin features were rendered accurately. The Stable Diffusion online application performs similarly.
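For reference, the zero-shot route looks roughly like the following, assuming the Hugging Face diffusers library and a publicly hosted Stable Diffusion checkpoint; the model id, device, and sampler settings are assumptions, and as noted above the clinical fidelity of the output is not guaranteed.

```python
import torch
from diffusers import StableDiffusionPipeline

# Load a pretrained text-to-image pipeline (assumed checkpoint id).
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = "a ringworm rash at the back of the neck of a dark-skinned person"
image = pipe(prompt, num_inference_steps=50, guidance_scale=7.5).images[0]
image.save("ringworm_zero_shot.png")
```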

Alternative textual formulations of the request might produce images that satisfy the requirements, but a safety-critical medical setting calls for generation that is accurate with respect to the semantics of a valid query. The students' efforts were of course hindered by the lack of training data and by limited computing power and time. They floated many novel ideas, such as transfer learning from one rash type to another, or from one rash type on one skin tone to another. Needless to say, my grading was based largely on whether their thinking was headed in the right direction rather than on a complete working system.
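For readers curious what the fine-tuning route mentioned above might look like, here is a heavily simplified sketch of continuing to train a pretrained latent-diffusion denoiser on a small set of captioned rash images, assuming the diffusers and transformers libraries; the checkpoint id, captions, and hyperparameters are placeholders, not what the students actually used.

```python
import torch
from diffusers import AutoencoderKL, UNet2DConditionModel, DDPMScheduler
from transformers import CLIPTextModel, CLIPTokenizer

model_id = "runwayml/stable-diffusion-v1-5"  # assumed checkpoint
vae = AutoencoderKL.from_pretrained(model_id, subfolder="vae")
unet = UNet2DConditionModel.from_pretrained(model_id, subfolder="unet")
text_encoder = CLIPTextModel.from_pretrained(model_id, subfolder="text_encoder")
tokenizer = CLIPTokenizer.from_pretrained(model_id, subfolder="tokenizer")
scheduler = DDPMScheduler.from_pretrained(model_id, subfolder="scheduler")

optimizer = torch.optim.AdamW(unet.parameters(), lr=1e-5)

def train_step(pixel_values, caption):
    # pixel_values: a batch of rash images normalized to [-1, 1].
    # Encode the images into the VAE latent space (SD v1 scaling factor).
    latents = vae.encode(pixel_values).latent_dist.sample() * 0.18215
    # Encode the caption, e.g. "ringworm rash on the back of the neck".
    tokens = tokenizer(caption, padding="max_length", truncation=True,
                       max_length=tokenizer.model_max_length, return_tensors="pt")
    text_emb = text_encoder(tokens.input_ids)[0]
    # Add noise at a random timestep and train the U-Net to predict it.
    noise = torch.randn_like(latents)
    t = torch.randint(0, scheduler.config.num_train_timesteps, (latents.shape[0],))
    noisy = scheduler.add_noise(latents, noise, t)
    pred = unet(noisy, t, encoder_hidden_states=text_emb).sample
    loss = torch.nn.functional.mse_loss(pred, noise)
    loss.backward(); optimizer.step(); optimizer.zero_grad()
    return loss.item()
```

With only a handful of images per rash type, such a loop is usually combined with parameter-efficient techniques, which is in the spirit of the transfer-learning ideas the students proposed.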

Sri Krishnamurthy, CFA, CAP

CEO, QuantUniversity | AI Expert | Educator | Author | TedX Speaker

9 months ago

Subrata Das, I did a research project in this area with another student asking the same question in 2021 (pre-ChatGPT :))! Here is the paper; you may find it interesting: Lu, Z. & Krishnamurthy, S. (2021). SkinGAN: Medical Image Synthetic Data Generation Using Generative Methods. Paper: https://www.slideshare.net/QuantUniversity/zijiasri-skinsynthesizepdf Code: https://github.com/ZijiaLewisLu/SkinGAN
