Research on Glioma MRI Image Generation Based on Large Language Model and Diffusion Model

Authors: Xiangli Cui, Chi Han, Man Hu, Wanli Huo, Xunan Wang, Jianguang Zhang, Yingying Zhang

Affiliations: Institute of Health and Medical Technology, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei; Departments of Radiation Oncology, Zibo Wanjie Cancer Hospital; Department of Oncology, Xiangya Hospital, Central South University; China Jiliang University; the Zhejiang-New Zealand Joint Vision-Based Intelligent Metrology Laboratory, College of Information Engineering, China Jiliang University; Department of Radiation Oncology, Shandong Cancer Hospital and Institute, Shandong First Medical University, Shandong Academy of Medical Sciences; China Jiliang University

Abstract:

Purpose:
Medical image generation has broad application prospects in deep learning, but model training is often limited by the scarcity of real image data. This study explores combining large language models with diffusion-based image generation to address the shortage of medical image data, focusing on glioma image generation. The goal is to generate high-quality glioma images that preserve medically relevant characteristics.
Methods:
This study combines a large language model with a diffusion model to optimize the medical image generation process. Within the Stable Diffusion framework, a U-Net and a variational autoencoder (VAE) are combined to generate high-quality medical images. The large language model is coupled with Stable Diffusion to enable text-driven generation, in which natural language descriptions (such as tumor location, size, and morphology) guide image synthesis. The model is trained for multiple rounds on a glioma dataset, and the generated images are evaluated by medical expert review and by quantitative metrics such as the structural similarity index (SSIM) and peak signal-to-noise ratio (PSNR).
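The two quantitative metrics named above can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names `psnr` and `global_ssim`, the 8-bit intensity range (`max_value=255`), and the nested-list image format are all assumptions made for illustration. The SSIM here is a simplified single-window variant computed over the whole image, whereas common implementations average SSIM over local sliding windows.

```python
import math


def _flatten(image):
    """Flatten a 2D image given as a list of rows into a flat list of pixels."""
    return [float(p) for row in image for p in row]


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE)."""
    ref, gen = _flatten(reference), _flatten(generated)
    if not ref or len(ref) != len(gen):
        raise ValueError("images must be non-empty and the same size")
    mse = sum((a - b) ** 2 for a, b in zip(ref, gen)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def global_ssim(reference, generated, max_value=255.0):
    """Single-window SSIM over the whole image (assumption: no sliding window)."""
    ref, gen = _flatten(reference), _flatten(generated)
    if not ref or len(ref) != len(gen):
        raise ValueError("images must be non-empty and the same size")
    n = len(ref)
    mu_x, mu_y = sum(ref) / n, sum(gen) / n
    var_x = sum((a - mu_x) ** 2 for a in ref) / n
    var_y = sum((b - mu_y) ** 2 for b in gen) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(ref, gen)) / n
    # Stabilizing constants from the standard SSIM definition (K1=0.01, K2=0.03).
    c1 = (0.01 * max_value) ** 2
    c2 = (0.03 * max_value) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


if __name__ == "__main__":
    real = [[10, 20], [30, 40]]   # toy stand-in for a real MRI slice
    fake = [[11, 21], [31, 41]]   # toy stand-in for a generated slice
    print("PSNR (dB):", psnr(real, fake))
    print("SSIM:", global_ssim(real, fake))
```

Higher values of both metrics indicate closer agreement with the reference image; identical images give an infinite PSNR and an SSIM of 1. In practice a windowed SSIM from an established image-processing library would likely be preferred over this simplified form.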
Results:
The trained model generates high-quality glioma images from input text descriptions. The images show a high degree of realism in tumor morphology, boundary clarity, and tissue contrast. Although some noise remains in fine details, the generated images generally meet the standards required for medical research and further model training.
Conclusion:
This study shows that combining a large language model with a Stable Diffusion model can effectively generate glioma images that meet medical standards. The method offers a new approach to glioma image generation and is expected to extend to other types of medical image generation in the future.
