Author: Mingli Chen, Xuejun Gu, Hao Jiang, Mahdieh Kazemimoghadam, Weiguo Lu, Qingying Wang, Kangning Zhang
Affiliation: Medical Artificial Intelligence and Automation (MAIA) Lab, Department of Radiation Oncology, UT Southwestern Medical Center, UT Southwestern Medical Center, Department of Radiation Oncology, Stanford University School of Medicine
Purpose:
Deep learning-based automatic medical image segmentation is increasingly employed in clinical practice, significantly reducing manual workload. However, segmentation results must still be verified despite gains in accuracy, and the growing reliance on automatic segmentation has increased this verification burden. To address this challenge, we developed nnAE, an auto-encoder based on the nnU-Net framework, for automatic anomaly detection and quality assurance of medical image segmentation.
Methods:
The nnAE was adapted from nnU-Net by replacing residual connections with zero matrices and using bottleneck features to reconstruct the input segmentation masks, with one output channel per mask. To directly reuse the existing segmentation framework of nnU-Net, the input masks were consolidated into a single channel. Trained on normal segmentation masks, nnAE is designed to reconstruct them, so significantly larger reconstruction errors are expected for anomalous input masks. We used AI-generated vertebral body (VB) segmentation masks from 727 CT scans, containing a variable number of VBs per scan and totaling 8,551 VBs, to train and test the nnAE (using a 5:1 training-to-testing ratio). To demonstrate the nnAE's effectiveness in detecting anomalies, anomalous masks were synthesized from the test dataset. These included incorrect or missing labels and inaccurate contours simulated through mask dilation or erosion.
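The mask consolidation and anomaly synthesis described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`consolidate`, `dilate`, `erode`, `resize_label`, `drop_label`, `swap_labels`) are hypothetical, the integer label encoding is assumed, and the face-connected structuring element and iteration counts are illustrative choices that the abstract does not specify.

```python
import numpy as np


def consolidate(masks):
    """Merge per-structure binary masks into one single-channel label map.

    Assumed encoding: masks[k] receives label k + 1 and 0 is background.
    Where masks overlap, the later mask overwrites the earlier one.
    """
    label_map = np.zeros(masks[0].shape, dtype=np.int16)
    for k, mask in enumerate(masks):
        label_map[np.asarray(mask, dtype=bool)] = k + 1
    return label_map


def dilate(mask, iterations=1):
    """Binary dilation with a face-connected (cross-shaped) neighbourhood."""
    out = np.asarray(mask, dtype=bool).copy()
    for _ in range(iterations):
        padded = np.pad(out, 1, mode="constant", constant_values=False)
        grown = out.copy()
        for axis in range(out.ndim):
            for start in (0, 2):  # neighbour at -1 and +1 along this axis
                idx = [slice(1, -1)] * out.ndim
                idx[axis] = slice(start, start + out.shape[axis])
                grown |= padded[tuple(idx)]
        out = grown
    return out


def erode(mask, iterations=1):
    """Binary erosion as the complement of dilating the complement.

    Voxels outside the array are treated as foreground, so the array
    border itself does not erode the mask.
    """
    return ~dilate(~np.asarray(mask, dtype=bool), iterations)


def resize_label(label_map, label, iterations=1, grow=True):
    """Simulate an inaccurate contour by dilating or eroding one label."""
    target = label_map == label
    resized = dilate(target, iterations) if grow else erode(target, iterations)
    out = label_map.copy()
    out[target] = 0
    out[resized] = label  # a grown label overwrites neighbouring voxels
    return out


def drop_label(label_map, label):
    """Simulate a missing label by removing one structure."""
    out = label_map.copy()
    out[out == label] = 0
    return out


def swap_labels(label_map, a, b):
    """Simulate incorrect labels by exchanging two structures."""
    out = label_map.copy()
    out[label_map == a] = b
    out[label_map == b] = a
    return out
```

In practice a library routine such as `scipy.ndimage.binary_dilation` would likely replace the hand-written morphology; it is written out here only to keep the sketch dependent on NumPy alone.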
Results:
The reconstruction errors showed negligible differences between the training and testing groups, with effect sizes of 0.07, 0.02, and 0.15 for the Dice similarity coefficient (DSC), 95th-percentile Hausdorff distance (HD95), and average surface distance (ASD), respectively. This indicates strong generalizability of nnAE. The synthesized anomalous masks resulted in significantly higher reconstruction errors, with HD95 being the most sensitive metric across all types of simulated anomalies.
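As a rough illustration of the quantities reported above, the sketch below computes the DSC between an input mask and its reconstruction and a standardized effect size between two groups of errors. The abstract does not state which effect-size definition was used; Cohen's d with a pooled standard deviation is assumed here, and the function names `dice` and `cohens_d` are hypothetical.

```python
import numpy as np


def dice(a, b):
    """Dice similarity coefficient between two binary masks."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # convention: two empty masks agree perfectly
    return 2.0 * np.logical_and(a, b).sum() / denom


def cohens_d(x, y):
    """Cohen's d with pooled standard deviation (assumed effect-size measure)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1)) / (nx + ny - 2)
    return (x.mean() - y.mean()) / np.sqrt(pooled_var)
```

Under this definition, values near zero mean that the training and testing error distributions are nearly indistinguishable relative to their spread.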
Conclusion:
A simple conversion from nnU-Net to nnAE is demonstrated, leveraging the self-configurability of nnU-Net for the contour QA application. The system can be used to generate alerts for segmentations that may require manual review and intervention.
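One way such alerts could be generated is sketched below. The abstract does not say how an alert threshold would be chosen, so the rule used here (mean plus k standard deviations of reconstruction errors measured on normal masks) is purely an assumption, as are the names `alert_threshold` and `needs_review` and the default `k=3.0`.

```python
import statistics


def alert_threshold(reference_errors, k=3.0):
    """Assumed rule: mean + k * sample standard deviation of errors on normal masks."""
    return statistics.mean(reference_errors) + k * statistics.stdev(reference_errors)


def needs_review(error, threshold):
    """Flag a segmentation whose reconstruction error exceeds the threshold."""
    return error > threshold
```

The reference errors could be, for example, HD95 values between normal masks and their reconstructions, since HD95 was reported as the most sensitive metric.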