A Vision-Language Model for T1-Contrast Enhanced MRI Generation for Glioma Patients

Author: Zachary Buchwald, Zach Eidex, Richard L.J. Qiu, Justin R. Roper, Mojtaba Safari, Hui-Kuo Shu, Xiaofeng Yang, David Yu

Affiliation: Emory University and Winship Cancer Institute, Emory University, Department of Radiation Oncology and Winship Cancer Institute, Emory University

Abstract:

Purpose: Gadolinium-based contrast agents (GBCAs) are commonly used in patients with gliomas to delineate and characterize brain tumors on T1-weighted (T1W) MRI. However, there is rising concern about GBCA-associated toxicity in some patients with renal dysfunction or a compromised blood-brain barrier. This study aims to develop a deep-learning framework that generates T1-postcontrast (T1C) images from pre-contrast multiparametric structural MRI.

Methods: We propose a text-guided vision transformer (Text-guided ViT) model, which combines the rich contextual information learned from T1W MRI by the multi-parametric ViT (MPR-ViT) model with textual features. The textual data consisted of clinical information (e.g., patient age, clinical diagnosis, description of tumor volume) together with GPT-4o mini's slice-by-slice descriptions. Features were then extracted from this text by the Bidirectional Encoder Representations from Transformers for Biomedical Text Mining (BioBERT) large language model. The Text-guided ViT model was applied to T1W and T1C MRI images of 501 glioma cases from an open-source dataset. The selected patients were divided into training (N=400), validation (N=50), and test (N=51) sets. Using the real T1C images as the ground truth, model performance was evaluated with the peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), and normalized mean squared error (NMSE).
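The three evaluation metrics named above can be illustrated with a minimal sketch. This is not the authors' evaluation code. It assumes images are flattened to lists of intensities scaled to [0, 1], uses a common NMSE definition (squared error normalized by the squared reference intensity), and computes SSIM over a single global window rather than the sliding window used by standard implementations. The function names and the `data_range` parameter are illustrative.

```python
import math


def nmse(reference, generated):
    """Sum of squared errors divided by the sum of squared reference values."""
    num = sum((r - g) ** 2 for r, g in zip(reference, generated))
    den = sum(r ** 2 for r in reference)
    return num / den


def psnr(reference, generated, data_range=1.0):
    """Peak signal-to-noise ratio in dB; data_range is the assumed max intensity."""
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(data_range ** 2 / mse)


def global_ssim(reference, generated, data_range=1.0):
    """SSIM over one global window (a simplification of sliding-window SSIM)."""
    n = len(reference)
    mu_x = sum(reference) / n
    mu_y = sum(generated) / n
    var_x = sum((r - mu_x) ** 2 for r in reference) / n
    var_y = sum((g - mu_y) ** 2 for g in generated) / n
    cov = sum((r - mu_x) * (g - mu_y) for r, g in zip(reference, generated)) / n
    # Conventional stabilizing constants with K1 = 0.01 and K2 = 0.03.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )


# Toy example: a 4-pixel "real T1C" image and a slightly different synthetic one.
real_t1c = [0.0, 0.5, 1.0, 0.5]
synthetic_t1c = [0.0, 0.5, 1.0, 0.4]
print(nmse(real_t1c, synthetic_t1c))
print(psnr(real_t1c, synthetic_t1c))
print(global_ssim(real_t1c, synthetic_t1c))
```

Lower NMSE and higher PSNR and SSIM indicate closer agreement between the synthetic and real T1C images.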

Results: Both qualitative and quantitative results demonstrated that the Text-guided ViT model performs favorably against the benchmark MPR-ViT model, achieving NMSE: 6.74 ± 5.57E-4, PSNR: 32.5 ± 2.4 dB, and SSIM: 0.937 ± 0.020, compared with NMSE: 6.90 ± 4.49E-4, PSNR: 32.2 ± 2.0 dB, and SSIM: 0.935 ± 0.018 for MPR-ViT.

Conclusion: The presented method generates synthetic T1C images that closely resemble real T1C images. Further development and application of this approach may enable contrast-agent-free MRI for brain tumor patients, eliminating the risk of GBCA toxicity and reducing the complexity of the MRI protocol.
