Enhanced Prediction of IROC Stereotactic Radiosurgery Phantom Audit Results with Treatment Parameters, Complexity Metrics, DVH, and Dosiomics Using Machine Learning

Authors: Lian Duan, Stephen F. Kry, Hunter S. Mehrens, Christine Peterson, Paige A. Taylor

Affiliation: The University of Texas MD Anderson Cancer Center, UT MD Anderson Cancer Center

Abstract:

Purpose: To develop predictive models for IROC SRS head phantom audits and to identify important factors influencing institutional performance.
Methods: The IROC SRS head phantom includes two TLDs and Gafchromic films to measure point and two-dimensional doses at the target. Between 2012 and 2022, a total of 898 irradiations were performed across 693 institutions using C-arm linear accelerators (CyberKnife and Gamma Knife were excluded). Institutional results were classified as failures when TLD-to-TPS dose differences exceeded 5% or gamma passing rates were below 85%. Model input features included 9 treatment parameters, 24 complexity metrics, 67 DVH parameters, and 100 dosiomics features. The dataset was divided into training and testing sets (4:1). Feature selection was conducted using the Minimum Redundancy Maximum Relevance and Boruta methods. Random forest, XGBoost, k-Nearest Neighbors (kNN), and ensemble models were developed to predict pass/fail outcomes, TLD ratios, and gamma values. Feature importance was assessed using SHAP analysis.
Results: XGBoost outperformed random forest and kNN in classification, while the ensemble model achieved the highest performance with an AUC of 0.896, accuracy of 0.906, and sensitivity of 0.961. Plan-averaged beam modulation and mean tongue-and-groove index were identified as the most important features in pass/fail classification. All models achieved mean absolute errors ranging from 2.5% to 2.8% for TLD and from 3.8% to 4.0% for gamma. Key predictors for TLD included leaf travel, MLC speed modulation, and mean MLC speed, while field size was the most important feature for gamma. Additionally, three dosiomics features were identified as important in each model, as were D95% for TLD predictions and D100% and V28Gy for gamma predictions.
Conclusion: Machine learning models effectively predict SRS phantom audit results using only pre-irradiation data. Complexity metrics significantly contributed to predicting phantom irradiation failures. Incorporating DVH and dosiomics further enhanced model performance. This approach could supplement physical phantom audits and thereby streamline clinical trial credentialing.
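
The failure criterion stated in Methods can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function name `is_failure`, the representation of TLD results as measured-to-TPS dose ratios, and the rule that any single TLD or film outside tolerance fails the irradiation are all assumptions made here. The abstract gives only the 5% and 85% thresholds.

```python
from typing import Sequence

TLD_TOLERANCE = 0.05    # 5% TLD-to-TPS dose difference (from the abstract)
GAMMA_THRESHOLD = 85.0  # gamma passing rate in percent (from the abstract)


def is_failure(tld_ratios: Sequence[float], gamma_rates: Sequence[float]) -> bool:
    """Label one phantom irradiation as pass (False) or fail (True).

    tld_ratios  -- measured-to-TPS dose ratio per TLD (hypothetical input format)
    gamma_rates -- gamma passing rate in percent per film (hypothetical input format)

    Assumption: a single out-of-tolerance TLD or film is enough to fail.
    """
    tld_fail = any(abs(ratio - 1.0) > TLD_TOLERANCE for ratio in tld_ratios)
    gamma_fail = any(rate < GAMMA_THRESHOLD for rate in gamma_rates)
    return tld_fail or gamma_fail
```

Labels produced this way would serve as the binary target for the pass/fail classifiers, while the TLD ratios and gamma values themselves are the regression targets.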
