Author: Steve B. Jiang, Dan Nguyen, Chenyang Shen, Fan-Chi F. Su, Jiacheng Xie, Shunyu Yan, Daniel Yang, Ying Zhang, You Zhang
Affiliation: Medical Artificial Intelligence and Automation (MAIA) Lab, Department of Radiation Oncology, UT Southwestern Medical Center; The University of Texas at Dallas
Purpose: Accurate delineation of treatment targets and organs-at-risk is crucial for radiotherapy. Despite significant progress in artificial intelligence (AI)-based automatic segmentation tools, efficient and reliable quality assurance (QA) of the resulting contours is still lacking, and it is particularly needed for time-sensitive online adaptive radiotherapy (oART). This study aims to develop an AI-driven automatic contour QA framework with integrated uncertainty quantification, offering a fast, robust, and clinically applicable solution.
Methods: Using MR-guided oART for prostate cancer as a testbed, we proposed a general contour QA framework that employs a contour quality estimation (ConQuE) model to assess segmentation quality across multiple organs. The ConQuE architecture is based on ResNet34 and takes the binary mask of a contoured organ and the corresponding MR image as inputs to predict clinical acceptability, categorized as either "Revision Required" or "Acceptable". A structure code was integrated into the model at multiple fully connected layers to specify the organ under evaluation. To enhance robustness, Monte Carlo (MC) dropout was incorporated to quantify prediction uncertainty, so that end users can see how confident each prediction is and act accordingly. As a proof of principle, we trained ConQuE to evaluate the segmentation quality of the urethra, femoral heads, and rectum in oART. Evaluation metrics included accuracy, area under the receiver operating characteristic curve (AUC), and predictive uncertainty calibration.
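The MC dropout step described above can be illustrated with a minimal sketch. This is not the ConQuE implementation: a single logistic unit stands in for the ResNet34 backbone, the structure code is assumed to be a one-hot vector added once rather than at multiple fully connected layers, and all names (`forward_with_dropout`, `mc_dropout_predict`, `w_feat`, `w_code`) and the dropout rate are hypothetical. It only shows the mechanics: dropout stays active at inference, several stochastic passes are averaged, and the spread of the passes serves as an uncertainty score.

```python
import math
import random


def forward_with_dropout(features, structure_code, w_feat, w_code, bias, p_drop, rng):
    """One stochastic forward pass of a toy classifier (dropout left ON)."""
    keep = 1.0 - p_drop
    # The structure code (assumed one-hot) tells the model which organ is evaluated.
    z = bias + sum(w * c for w, c in zip(w_code, structure_code))
    for x, w in zip(features, w_feat):
        if rng.random() < keep:      # keep this unit with probability `keep`
            z += w * x / keep        # inverted-dropout rescaling
    return 1.0 / (1.0 + math.exp(-z))  # toy probability of "Acceptable"


def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli distribution with parameter p."""
    p = min(max(p, 1e-12), 1.0 - 1e-12)
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def mc_dropout_predict(features, structure_code, w_feat, w_code, bias,
                       n_passes=20, p_drop=0.5, seed=0):
    """Average n_passes stochastic passes; return label, mean, entropy, variance."""
    rng = random.Random(seed)
    probs = [
        forward_with_dropout(features, structure_code, w_feat, w_code, bias, p_drop, rng)
        for _ in range(n_passes)
    ]
    mean_p = sum(probs) / n_passes
    variance = sum((p - mean_p) ** 2 for p in probs) / n_passes
    label = "Acceptable" if mean_p >= 0.5 else "Revision Required"
    return label, mean_p, binary_entropy(mean_p), variance


# Toy inputs: three made-up features and a one-hot code for the second of three organs.
label, mean_p, entropy, variance = mc_dropout_predict(
    features=[0.8, -0.3, 1.2],
    structure_code=[0, 1, 0],
    w_feat=[0.5, -1.0, 0.7],
    w_code=[0.1, 0.2, -0.4],
    bias=0.1,
)
print(label, round(mean_p, 3), round(entropy, 3), round(variance, 4))
```

In a deep-learning framework the same idea amounts to keeping the dropout layers in training mode at inference and repeating the forward pass. Whether entropy, variance, or another statistic is used as the uncertainty score is a design choice the abstract does not specify.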
Results: The average computation time for contour QA, using 20 independent forward passes for MC dropout, was ~12.9 ms per slice. ConQuE achieved accuracies of 92.3%, 94.0%, and 94.3% for the urethra, femoral heads, and rectum, respectively, with AUC consistently above 0.93. High uncertainty scores correlated strongly with misclassifications, illustrating the robustness and potential utility of the proposed framework in real clinical settings.
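For readers unfamiliar with the reported metrics, the sketch below shows one common way to compute accuracy and AUC from predicted probabilities. It is an illustration only, not the authors' evaluation code: the function names are hypothetical and the numbers are made-up toy values, not study data. AUC is computed here as the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, with ties counted as one half.

```python
def accuracy(probs, labels, threshold=0.5):
    """Fraction of cases whose thresholded prediction matches the label."""
    preds = [1 if p >= threshold else 0 for p in probs]
    return sum(int(a == b) for a, b in zip(preds, labels)) / len(labels)


def roc_auc(scores, labels):
    """Rank-based AUC: P(score of a positive > score of a negative), ties = 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (1 = "Acceptable", 0 = "Revision Required"); values are invented.
toy_probs = [0.9, 0.2, 0.6, 0.4]
toy_labels = [1, 0, 0, 1]
print(accuracy(toy_probs, toy_labels), roc_auc(toy_probs, toy_labels))
```

The same `roc_auc` function could also be applied with uncertainty scores as `scores` and an indicator of misclassification as `labels`, which is one possible way to quantify how well high uncertainty flags wrong predictions. The abstract does not state which statistic was used for that analysis.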
Conclusion: The proposed AI-driven framework can accurately assess contour quality while providing uncertainty estimates, improving the efficiency and confidence of clinicians' contour-editing decisions, especially under the tight time constraints of oART.