Evaluating Commercial Auto-Segmentation Software: Is Performance on Pediatric Organs-at-Risk Accurate?

Authors: Gregory T. Armstrong, James E. Bates, Lei Dong, Ralph Ermoian, Jie Fu, Christine Hill-Kayser, Rebecca M. Howell, Sharareh Koufigar, John T. Lucas, Thomas E. Merchant, Tucker J. Netherton, Sogand Sadeghi

Affiliation: Department of Radiation Oncology, University of Washington and Fred Hutchinson Cancer Center, Department of Epidemiology and Cancer Control, St. Jude Children’s Research Hospital, St. Jude Children’s Research Hospital, Department of Radiation Oncology, Fred Hutchinson Cancer Center, University of Washington, Department of Radiation Oncology, St. Jude Children’s Research Hospital, Department of Radiation Physics, The University of Texas MD Anderson Cancer Center, Department of Radiation Oncology, University of Washington/Fred Hutchinson Cancer Center, Department of Radiation Oncology, University of Pennsylvania, University of Pennsylvania, Department of Radiation Oncology and Winship Cancer Institute, Emory University

Abstract:

Purpose: This study evaluates the adaptability and limitations of commercially available auto-segmentation tools (MIM, RayStation), trained on predominantly adult datasets (ages 20–60+ years), for delineating organs at risk (OARs) in pediatric radiotherapy patients.
Methods: Non-contrast CT scans from 353 pediatric patients (mean age: 7 years; range: 5 days to 16 years; 177 males and 176 females) were obtained from The Cancer Imaging Archive. This study focused on the thoracic and abdominal/pelvic regions, with annotations provided for 27 OARs, including the bladder, kidneys, liver, heart, and lungs. Auto-segmented OARs generated using MIM and RayStation were compared to physician-approved manual contours. Quantitative segmentation accuracy was assessed using the Dice Similarity Coefficient (DSC) and Average Hausdorff Distance (AHD).
Results: Across all regions, MIM and RayStation achieved comparable mean segmentation accuracy (DSC 86.3% ± 5.3%, AHD 1.68 ± 0.60 mm vs. DSC 86.9% ± 5.0%, AHD 1.72 ± 0.52 mm, respectively). MIM showed improved geometric precision in the thoracic region (p = 0.043), while RayStation demonstrated superior performance in the abdominal/pelvic region (p = 0.024). In the thoracic region, MIM achieved a mean DSC of 90.4% ± 6.1% and an AHD of 1.79 ± 0.72 mm, while RayStation achieved a DSC of 87.6% ± 8.0% and an AHD of 1.70 ± 0.54 mm. In the abdominal/pelvic region, MIM had a DSC of 84.1% ± 3.4% and an AHD of 1.62 ± 0.56 mm, whereas RayStation had a DSC of 86.5% ± 2.4% and an AHD of 1.74 ± 0.54 mm.
Conclusion: Our findings demonstrate that existing commercial platforms trained on adult populations can be adapted to delineate pediatric OARs in radiotherapy. However, they should be used with caution and need further refinement for pediatric radiotherapy.
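
Illustrative sketch: the two metrics named in Methods (DSC and AHD) can be computed from a pair of binary masks roughly as follows. This is a minimal example and not the authors' pipeline. The abstract does not say which AHD definition was used, so the code assumes one common variant: the mean of the two directed average distances, taken over all mask voxels rather than surface voxels only. The function names, the `spacing` parameter, and the toy masks are hypothetical.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt


def dice_coefficient(auto_mask, manual_mask):
    """DSC = 2|A ∩ B| / (|A| + |B|) for two binary masks of equal shape."""
    auto = np.asarray(auto_mask, dtype=bool)
    manual = np.asarray(manual_mask, dtype=bool)
    total = auto.sum() + manual.sum()
    if total == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return 2.0 * np.logical_and(auto, manual).sum() / total


def average_hausdorff_distance(auto_mask, manual_mask, spacing=(1.0, 1.0, 1.0)):
    """Mean of the two directed average distances (assumed AHD variant).

    spacing is the voxel size along each axis, e.g. in mm.
    """
    auto = np.asarray(auto_mask, dtype=bool)
    manual = np.asarray(manual_mask, dtype=bool)
    if not auto.any() or not manual.any():
        raise ValueError("AHD is undefined when either mask is empty")
    # Distance from every voxel to the nearest voxel inside each mask.
    dist_to_manual = distance_transform_edt(~manual, sampling=spacing)
    dist_to_auto = distance_transform_edt(~auto, sampling=spacing)
    forward = dist_to_manual[auto].mean()   # auto -> manual
    backward = dist_to_auto[manual].mean()  # manual -> auto
    return 0.5 * (forward + backward)


# Toy example: two 4x4x4 cubes offset by one voxel along the first axis.
auto = np.zeros((10, 10, 10), dtype=bool)
manual = np.zeros((10, 10, 10), dtype=bool)
auto[2:6, 2:6, 2:6] = True
manual[3:7, 2:6, 2:6] = True

dsc = dice_coefficient(auto, manual)            # 0.75
ahd = average_hausdorff_distance(auto, manual)  # 0.25 (voxel units)
```

Passing the CT voxel size as `spacing` gives the distance in millimetres, which is the unit reported in Results. A surface-based AHD variant would generally give larger values than this volume-based one, so the numbers are comparable only when the same definition is used.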
