A Novel Feature Selection Method for Survival Prediction of Head-and-Neck Following Radiation Therapy

Author: Xiaoying Pan, X. Sharon Qi

Affiliation: Department of Radiation Oncology, University of California, Los Angeles; School of Computer Science and Technology, Xi'an University of Posts and Telecommunications

Abstract:

Purpose:
Survival prediction for cancer remains a substantial challenge in personalized oncology because medical data are intricate and high-dimensional. This study introduces a novel feature selection approach that curates an optimal feature set to improve predictive accuracy.
Methods:
Our proposed selection method, Enhanced Genetic Algorithm with Lasso (EGA-Lasso), integrates a genetic algorithm (GA) with regularization and Lasso regression to process high-dimensional feature sets. The GA fitness function is defined as AUC_score − α × feature_ratio, where feature_ratio is the proportion of selected features and α controls selection sparsity. This regularization term favors smaller feature subsets while optimizing model performance. The GA is optimized iteratively, with a grid search used to determine the optimal α. Lasso regression then further refines the feature subset, mitigating multicollinearity and identifying key features.
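The fitness function described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `auc_score` and `fitness`, the binary-mask encoding of a GA chromosome, and the rank-based AUC computation are all assumptions made for illustration.

```python
from typing import Sequence


def auc_score(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Rank-based AUC: fraction of (positive, negative) pairs in which the
    positive sample receives the higher score; ties count as half a win."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute AUC")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fitness(mask: Sequence[int], auc: float, alpha: float) -> float:
    """Fitness of one candidate feature subset: AUC_score - alpha * feature_ratio.

    `mask` is a hypothetical chromosome encoding: 1 keeps a feature, 0 drops it.
    """
    feature_ratio = sum(mask) / len(mask)  # proportion of selected features
    return auc - alpha * feature_ratio


# Two candidate subsets with the same AUC: the sparser one scores higher.
sparse = fitness([1, 0, 0, 0], auc=0.9, alpha=0.1)
dense = fitness([1, 1, 1, 1], auc=0.9, alpha=0.1)
```

With equal AUC, the penalty term alone separates the two candidates, which is how a larger α pushes the search toward smaller subsets. In a full pipeline the `auc` argument would come from a classifier trained on the masked features; that step is omitted here.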
We evaluated EGA-Lasso on 427 head-and-neck cancer patients from The Cancer Imaging Archive (TCIA), stratified into short-term (40.5%) and long-term (59.5%) survival groups using a five-year survival threshold. For each patient, a comprehensive feature set was constructed from CT images, comprising 1,595 radiomics features (via Pyradiomics), 512 high-level deep-learning (DL) features (from ResNet18), and 17 clinical features. Various machine-learning (ML) classifiers were trained and tested on both the selected and the original feature sets. Performance was assessed through k-fold cross-validation using AUC, accuracy, precision, recall, and F1-score.
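The threshold-based metrics listed above can be computed from one fold's predictions as in the following sketch. The function name `classification_metrics` and the choice of label 1 as the positive class are assumptions for illustration; the abstract does not state which survival group was treated as positive.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive, assumed)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy predictions for a single fold (made-up labels, not study data).
fold_metrics = classification_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0])
```

In k-fold cross-validation, these per-fold values would typically be averaged across the k held-out folds.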
Results:
EGA-Lasso substantially improved survival prediction over traditional methods, achieving an AUC of 0.91. By comparison, conventional Lasso regression on basic radiomic features alone yielded an AUC of 0.76, incorporating DL-based radiomic features increased it to 0.84, and employing all features with traditional Lasso achieved an AUC of 0.86.
Conclusion:
Our method successfully identifies an optimal feature subset, yielding superior performance in head-and-neck cancer survival prediction. EGA-Lasso maintains lower model complexity and effectively mitigates overfitting, suggesting broad applicability across diverse predictive modeling scenarios.
