Author: Sam Armstrong, Jamison Louis Brooks, Nicole Johnson, Douglas John Moseley, Cassie Sonnicksen, Erik J. Tryggestad
Affiliation: Mayo Clinic
Purpose: To evaluate the feasibility of a shallow learning-based quality assurance (QA) tool designed to assist human reviewers in assessing organ-at-risk (OAR) contours for head and neck radiotherapy.
Methods: Semi-automated approaches to OAR contour review can improve radiation treatment plan quality and reduce the time required for manual review. Previously, we developed a knowledge-based model capable of accurately identifying outlier contours using curated high-quality data and artificially generated errors. However, this model lacked an interface for clinical use and had not been tested with real-world clinical errors.
Results: Model performance in the prospective pilot closely mirrored results from retrospective testing with artificial errors across thirty-six OAR contour types. Sensitivity and specificity were 0.846 and 0.898, respectively, compared with 0.875 and 0.881 in the retrospective holdout set with artificial errors. The area under the curve for the primary decision model was 0.912 prospectively versus 0.936 in the retrospective holdout set with artificial errors.
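For readers less familiar with these metrics, the sketch below shows how sensitivity, specificity, and the area under the ROC curve (AUC) are conventionally computed for a binary QA decision such as flagging an outlier contour. This is an illustrative example, not the authors' implementation; the labels and scores are hypothetical (1 = true contour error, 0 = acceptable contour), and the decision threshold of 0.5 is an assumption.

```python
def sensitivity_specificity(y_true, y_pred):
    # Sensitivity = TP / (TP + FN): fraction of true errors the model flags.
    # Specificity = TN / (TN + FP): fraction of acceptable contours it passes.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def auc(y_true, scores):
    # Rank-based (Mann-Whitney) estimate of the area under the ROC curve:
    # the probability that a randomly chosen positive case receives a
    # higher model score than a randomly chosen negative case.
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: three true errors and three acceptable contours.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
roc_auc = auc(y_true, scores)
```

On this toy data the model misses one true error and falsely flags one acceptable contour, giving sensitivity and specificity of 2/3 each and an AUC of 8/9, mirroring the structure (though not the values) of the results reported above.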
Conclusion: The QA tool demonstrated robust performance for clinically relevant auto-segmentation errors, comparable to its performance with artificial errors. The review form effectively facilitated communication between the model and reviewers, streamlining contour evaluation and enabling automated collection of reviewer input for further analysis. This tool shows promise for integrating machine learning techniques into routine clinical workflows to improve contour review efficiency and accuracy.