Author: Indrin J. Chetty, Jing Cui, Mitchell Kamrava, Tiffany M. Phillips, Jennifer M. Steers, Brad Stiehl
Affiliation: Department of Radiation Oncology, Cedars-Sinai Medical Center
Purpose: Auto-contouring for high-dose-rate (HDR) interstitial brachytherapy can be confounded by large anatomical deformations and by image quality. Here we evaluated the performance of AI-based auto-contouring software for normal tissues, and the resulting dosimetric effects.
Methods: We retrospectively analyzed 10 cervical interstitial HDR cases. The prescription dose was 7 Gy/fraction for 4 fractions delivered twice daily (BID), with one treatment plan applied to all fractions. Physician-drawn bladder and rectum contours (the gold standard) were compared with AI-generated contours using the Dice Similarity Coefficient (DSC), Hausdorff Distance (HD), and Average Surface Distance (ASD). Dosimetric differences were assessed using D2cc (the minimum dose received by the hottest 2 cc of the organ at risk, OAR).
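As an illustration of the three geometric metrics named above, here is a minimal sketch in plain Python. It is not the software or implementation used in this study. It assumes contours are given as non-empty sets of integer voxel indices on an isotropic grid, that HD is the maximum (not the 95th percentile) symmetric surface distance, and that ASD is the symmetric mean over both surfaces; the abstract does not specify these choices. All function names and the toy masks are hypothetical.

```python
import math

def dice(a, b):
    """Dice Similarity Coefficient between two sets of voxel indices."""
    if not a and not b:
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))

def surface(voxels):
    """Voxels with at least one face-neighbor outside the set."""
    out = set()
    for v in voxels:
        for axis in range(len(v)):
            for step in (-1, 1):
                n = v[:axis] + (v[axis] + step,) + v[axis + 1:]
                if n not in voxels:
                    out.add(v)
    return out

def _nearest(src, dst):
    """Distance from each point in src to its closest point in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]

def hausdorff(a, b):
    """Maximum symmetric surface distance (assumed HD definition)."""
    sa, sb = surface(a), surface(b)
    return max(_nearest(sa, sb) + _nearest(sb, sa))

def average_surface_distance(a, b):
    """Symmetric mean surface distance (assumed ASD definition)."""
    sa, sb = surface(a), surface(b)
    d = _nearest(sa, sb) + _nearest(sb, sa)
    return sum(d) / len(d)

# Toy 2D example (hypothetical data): a 4x4 square and the same
# square shifted by one voxel along x.
physician = {(x, y) for x in range(4) for y in range(4)}
ai = {(x, y) for x in range(1, 5) for y in range(4)}

dsc = dice(physician, ai)
hd = hausdorff(physician, ai)
asd = average_surface_distance(physician, ai)
print(dsc, hd, asd)  # 0.75 1.0 0.5 (distances in voxel units)
```

Distances here are in voxel units; with real image data they would be scaled by the voxel spacing to obtain millimeters.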
Results: High-risk CTV (HR-CTV) volumes averaged 52.8 cc (range 23.9-137.7 cc). DSC values (mean ± SD) were 0.93±0.0 for the bladder and 0.67±0.1 for the rectum. HD values were 11.5±2.3 mm for the bladder and 31.3±16.8 mm for the rectum. ASD values were 5.1±1.1 mm and 4.0±0.8 mm for the bladder and rectum, respectively. Differences in D2cc were 0.51±0.46 Gy/fx for the bladder and 0.45±0.44 Gy/fx for the rectum. AI-generated rectum contours were systematically segmented more inferiorly than the gold-standard contours, which ended at S3 per standard atlases. Additionally, the AI categorized the superior rectum as part of the sigmoid colon. The software was trained to segment bowel loops but not bowel bag contours; the latter will be a focus of future work.
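To make the D2cc comparison concrete, the following is a rough voxel-based sketch, assuming each OAR is represented by a list of per-voxel doses with a uniform voxel volume. Clinical treatment planning systems typically interpolate the dose-volume histogram, so this is only an approximation. The function name, voxel volume, and dose values are hypothetical and are not data from this study.

```python
import math

def d2cc(doses_gy, voxel_volume_cc, volume_cc=2.0):
    """Minimum dose among the hottest voxels that together fill volume_cc."""
    # Number of voxels needed to reach the requested volume.
    n = math.ceil(volume_cc / voxel_volume_cc - 1e-9)
    if n > len(doses_gy):
        raise ValueError("structure is smaller than the requested volume")
    hottest = sorted(doses_gy, reverse=True)[:n]
    return hottest[-1]

# Hypothetical per-voxel doses (Gy per fraction) inside the same organ
# as outlined by two different contours, with 0.5 cc voxels.
physician_doses = [5.0, 6.2, 4.1, 5.8, 3.0, 5.5]
ai_doses = [5.0, 6.2, 4.1, 5.8, 3.0, 4.6]

d_physician = d2cc(physician_doses, voxel_volume_cc=0.5)
d_ai = d2cc(ai_doses, voxel_volume_cc=0.5)
difference = abs(d_physician - d_ai)
print(d_physician, d_ai, round(difference, 2))
```

A contour that includes or excludes even a small high-dose region changes which voxels count toward the hottest 2 cc, which is how contour discrepancies translate into D2cc differences.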
Conclusion: AI-generated bladder contours generally aligned well with physician contours, but significant discrepancies were sometimes noted, as reflected in the relatively high rectum HD values. Further training of AI models for GYN HDR is needed, including bowel-bag-specific contours. This work highlights the importance of carefully training AI-based algorithms on context-specific and multi-modal datasets so that their predictions are robust to gold-standard contour variability and image quality.