Multimodal Framework for Predicting Radiation-Induced Severe Acute Esophagitis in Esophageal Cancer

Authors: Yeona Cho, Chloe Min Seo Choi, Joseph O. Deasy, Jue Jiang, Jihun Kim, Jin Sung Kim, Nikhil Mankuzhy, Aneesh Rangnekar, Andreas Rimner, Maria Thor, Harini Veeraraghavan, Abraham Wu

Affiliation: University of Freiburg, Department of Medical Physics, Memorial Sloan Kettering Cancer Center, Department of Radiation Oncology, Yonsei Cancer Center, Heavy Ion Therapy Research Institute, Yonsei University College of Medicine, Department of Radiation Oncology, Gangnam Severance Hospital, Yonsei University College of Medicine, Memorial Sloan Kettering Cancer Center, Yonsei University

Abstract:

Purpose: We hypothesized that combining clinical, imaging, and radiotherapy dose-distribution features could improve the accuracy of models predicting radiation-induced severe acute esophagitis (SAE) in esophageal cancer (EC). This study aimed to develop a multi-modal deep learning framework for predicting radiation therapy (RT)-induced esophagitis.

Methods: This retrospective study included 183 esophageal cancer (EC) patients treated at a single institution, 82 of whom developed SAE. For each patient, the data included the planning CT, planned RT dose, contours (gross tumor volume [GTV] and organs at risk [OARs]), OAR dose metrics, and baseline clinical characteristics such as stage, histology, surgical history, smoking status, and diabetes status. ResNet-50 was used as the image model to analyze 2D slices from the CT, dose, and fused CT-dose maps. The clinical model received baseline patient characteristics and dose metrics as inputs, and features were selected using least absolute shrinkage and selection operator (LASSO) regression. The LASSO model was run for 1000 iterations with an 80/20 train-test split, and features appearing in more than 80% of iterations were selected. Finally, the image-derived features from ResNet-50 were aggregated with the selected clinical features and passed into a fully connected classifier for the final prediction. Model performance was evaluated with 10-fold cross-validation, using the area under the curve (AUC) on the held-out validation folds.
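The repeated-split LASSO feature selection described in the Methods can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: it assumes each iteration draws a fresh random 80/20 split, fits a plain linear LASSO to the 0/1 outcome by coordinate descent, and counts a feature as "appearing" when its coefficient is nonzero. The penalty `alpha`, the function names, and the synthetic data are all hypothetical, and the demo uses fewer iterations than the 1000 reported so that it runs quickly.

```python
import random


def standardize(X):
    """Center each column and scale it to unit variance (constant columns become zeros)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    stds = [(sum((row[j] - means[j]) ** 2 for row in X) / n) ** 0.5 or 1.0
            for j in range(p)]
    return [[(row[j] - means[j]) / stds[j] for j in range(p)] for row in X]


def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly zero."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_fit(X, y, alpha, max_sweeps=100, tol=1e-8):
    """Minimize (1/2n)*||y - b0 - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent.

    Assumes the columns of X are centered, so the intercept b0 is simply mean(y).
    """
    n, p = len(X), len(X[0])
    b0 = sum(y) / n
    resid = [yi - b0 for yi in y]  # residual for the starting point w = 0
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    w = [0.0] * p
    for _ in range(max_sweeps):
        biggest_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # a constant column carries no information
            # Correlation of feature j with the residual that excludes its own term.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + w[j] * col_sq[j]
            new_wj = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                w[j] = new_wj
                biggest_change = max(biggest_change, abs(delta))
        if biggest_change < tol:
            break
    return w


def selection_frequency(X, y, alpha, n_iter=1000, train_frac=0.8, seed=0):
    """Fraction of random train splits in which each feature gets a nonzero weight."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    n_train = int(round(train_frac * n))
    counts = [0] * p
    for _ in range(n_iter):
        idx = rng.sample(range(n), n_train)  # training part of an 80/20 split
        X_train = standardize([X[i] for i in idx])
        y_train = [y[i] for i in idx]
        w = lasso_fit(X_train, y_train, alpha)
        for j in range(p):
            if w[j] != 0.0:
                counts[j] += 1
    return [c / n_iter for c in counts]


# Synthetic demo (hypothetical data): only feature 0 drives the outcome.
rng = random.Random(42)
n_patients, n_features = 80, 4
X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_patients)]
y = [1.0 if row[0] + 0.3 * rng.gauss(0.0, 1.0) > 0 else 0.0 for row in X]

freq = selection_frequency(X, y, alpha=0.15, n_iter=200)
selected = [j for j, f in enumerate(freq) if f > 0.8]  # the >80% rule
```

Here `selected` holds the indices of features that pass the >80% threshold. In practice an L1-penalized logistic model or a library LASSO with a tuned penalty could replace `lasso_fit`; the abstract does not specify which variant was used.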

Results: The Dose+clinical model achieved the highest average AUC (0.82). All multi-modal models (CT+clinical, AUC=0.75; Dose+clinical, AUC=0.82; Overlay+clinical, AUC=0.77) achieved a higher AUC than their image-only counterparts (CT, AUC=0.70; dose, AUC=0.75; fused CT-dose, AUC=0.70). The LASSO-based clinical model had the lowest AUC of all models (0.63).
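For readers less familiar with the metric, the sketch below shows one way to compute an AUC from predicted scores, assuming the reported AUC is the area under the ROC curve. Under that assumption it equals the probability that a randomly chosen patient with SAE receives a higher score than a randomly chosen patient without SAE, with ties counted as one half. The function name and the toy labels and scores are hypothetical and are not taken from the study.

```python
def auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0   # positive case correctly ranked above negative case
            elif sp == sn:
                wins += 0.5   # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical): 1 = SAE, 0 = no SAE.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_scores)  # 3 of 4 pairs are ranked correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading the values above.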

Conclusion: Combining imaging and clinical data improves the prediction of RT-induced esophagitis. These findings highlight the value of multi-modal frameworks for enhancing prediction accuracy and guiding personalized treatment.
