A Hybrid Radiomics-Integrated Machine Learning Framework for Early Identification of Potential Radiation Pneumonitis in Lung Cancer Patients

Authors: Christos Ilioudis, Marios Myronakis, Sotirios Raptis, Kyriaki Theodorou

Affiliations: Medical Physics Department, Medical School, University of Thessaly; Department of Information and Electronic Engineering, International Hellenic University (IHU)

Abstract:

Purpose: This study presents a radiomics-driven machine learning framework for predicting Radiation Pneumonitis (RP), a side effect of radiation therapy, in lung cancer patients. In addition, synthetic data generation and self-supervised learning are used to mitigate the limited availability of RP data.
Methods: The feature-driven models integrate texture, shape, and intensity features extracted from CT DICOM images of 2,963 lung cancer patients, with balanced representation of disease stages and treatments, including cases with and without radiation therapy-induced RP. We developed a novel hybrid machine learning framework that combines two disparate models, DenseNet-201 and XGBoost, to optimize performance and select the most suitable radiomic features, using SHAP and permutation techniques for feature importance analysis. A conditional GAN with spatial attention mechanisms and radiomics-augmented learning generates synthetic images tailored to the task. The approach uses fivefold cross-validation and augmentation techniques, including rotation, flipping, and scaling, to support reproducible and reliable results across datasets.
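As an illustration of the fivefold cross-validation mentioned in the Methods, the following minimal sketch builds five folds that keep roughly the same proportion of RP and non-RP cases in each fold. Stratifying by label is an assumption made here for illustration; the abstract does not state how the folds were constructed. The function name `stratified_kfold` and the toy labels are hypothetical and are not study data.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=5, seed=0):
    """Return k (train_idx, test_idx) pairs with class proportions roughly preserved."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    # Shuffle within each class, then deal indices round-robin so every
    # fold receives a near-equal share of each class.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    # Each fold serves once as the test set; the remaining folds form the training set.
    splits = []
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j in range(k) if j != i for idx in folds[j])
        splits.append((train, test))
    return splits


# Toy labels: 1 = RP, 0 = no RP (illustrative only).
labels = [1] * 10 + [0] * 40
splits = stratified_kfold(labels, k=5)
```

In a pipeline of this kind, any augmentation (rotation, flipping, scaling) and any synthetic images would normally be applied to the training indices of each split only, so that the held-out fold contains unmodified real cases.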
Results: Among the individual models, DenseNet-201 showed the highest classification performance, with an accuracy of 92.4%, while XGBoost achieved 89.7%. Key radiomic features, including GLCM entropy and shape compactness, were identified as critical determinants of malignancy. The integrated model achieved 94.04% accuracy and a Dice coefficient of 87.6% for segmentation. Synthetic data generation increased dataset diversity, contributing to a performance boost of up to 15%. Spatial attention mechanisms improved localization precision, showing a strong correlation with radiological annotations.
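For readers unfamiliar with two of the quantities reported in the Results, the sketch below shows one common way to compute a Dice coefficient between two binary masks and an entropy value from a gray-level co-occurrence matrix (GLCM). It is a simplified illustration, not the authors' implementation: the single horizontal neighbor offset, the symmetric GLCM, the base-2 logarithm, and the function names are all assumptions made here.

```python
import math


def dice_coefficient(pred, truth):
    """Dice = 2|A and B| / (|A| + |B|) for two flat binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks are treated as a perfect match (a convention, not a rule).
    return 1.0 if total == 0 else 2.0 * intersection / total


def glcm_entropy(image, levels):
    """Entropy of a symmetric, normalized GLCM built from horizontal neighbors.

    `image` is a list of rows of integer gray levels in the range [0, levels).
    """
    counts = [[0] * levels for _ in range(levels)]
    for row in image:
        # Count each horizontally adjacent pair in both directions (symmetric GLCM).
        for a, b in zip(row, row[1:]):
            counts[a][b] += 1
            counts[b][a] += 1
    total = sum(sum(r) for r in counts)
    entropy = 0.0
    for r in counts:
        for c in r:
            if c:
                p = c / total
                entropy -= p * math.log2(p)
    return entropy


# Toy inputs (illustrative only, not study data).
dice = dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0])
uniform_entropy = glcm_entropy([[0, 0], [0, 0]], levels=1)
checker_entropy = glcm_entropy([[0, 1], [1, 0]], levels=2)
```

A uniform patch yields zero entropy because only one gray-level pair ever occurs, while more heterogeneous texture spreads probability over more pairs and raises the value.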
Conclusion: This study introduces a hybrid machine learning framework that addresses the limitations of RP data through synthetic data generation and optimal radiomic feature selection. The framework achieves robust performance metrics, demonstrating its potential as a clinically relevant tool for predicting RP likelihood with improved accuracy and interpretability.
