Author: Ming Dong, Carri K. Glide-Hurst, Qisheng He, Anudeep Kumar, Alex Singleton Kuo, Joshua Pan, Chase Ruff, Nicholas R. Summerfield
Affiliation: Department of Computer Science, Wayne State University, Departments of Human Oncology and Medical Physics, University of Wisconsin-Madison, Department of Human Oncology, University of Wisconsin-Madison
Purpose: Recent evidence highlights the importance of incorporating cardiac substructures (CS) into treatment planning for thoracic cancers. However, current segmentation methods are limited to a single image modality and require separate models to decouple overlapping structures. We propose a Modality AGnostic Image Cascade (MAGIC) pipeline that delivers a single, lightweight model for rapid CS segmentation across any treatment planning modality, leveraging multi-modality, feature-rich datasets and preserving overlapping volumes to further enhance segmentation accuracy.
Methods: Multi-modality datasets including MR-Linac (n=62), simulation CT (sim-CT, n=83), and cardiac CT angiography (CCTA, n=56) were used to train, validate, and test MAGIC. All images were labeled with 13 CS, including the whole heart (group 1), chambers and great vessels (group 2), and coronary arteries (group 3). Modality-specific image encoders with single, shared decoders and prediction encoders were implemented to enable cross-modality learning and render modality-agnostic predictions for any given image input. Separate segmentation models target each group in a novel cascaded structure to facilitate multi-structure segmentation. For comparison, equivalent Unimodal models were trained for each imaging modality and structural group. All 9 Unimodal models were compared to MAGIC using the Dice Similarity Coefficient (DSC) and 95% Hausdorff distance (HD95) via paired t-tests (p<0.05) and qualitative assessment.
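As an illustration of the DSC metric named above, the following is a minimal sketch, not the authors' evaluation code. It assumes binary masks flattened to equal-length sequences of 0/1; the function name `dice_similarity` and the convention of returning 1.0 when both masks are empty are assumptions made for this example.

```python
def dice_similarity(pred, truth):
    """Dice Similarity Coefficient, DSC = 2|A ∩ B| / (|A| + |B|).

    `pred` and `truth` are flattened binary masks (equal-length
    sequences of 0/1). Hypothetical helper for illustration only.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")

    # Voxels labeled as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)

    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


# Toy 6-voxel example: 2 overlapping voxels, 3 foreground voxels per mask.
pred = [0, 1, 1, 1, 0, 0]
truth = [0, 0, 1, 1, 1, 0]
print(round(dice_similarity(pred, truth), 4))
```

Because each structure is scored from its own binary mask, overlapping structures can be evaluated independently. The counts reported in this abstract are consistent with 3 modalities x 3 structural groups = 9 Unimodal models and 3 modalities x 13 CS = 39 comparisons per metric.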
Results: MAGIC achieved an average DSC/HD95 of 0.70±0.28/8.27±5.33 mm for MR-Linac, 0.71±0.21/8.38±3.65 mm for sim-CT, and 0.86±0.10/5.48±4.70 mm for CCTA images. MAGIC improved significantly over Unimodal, outperforming it in 34/39 DSC and 31/39 HD95 comparisons. MAGIC's largest DSC improvement over Unimodal was 0.16 (p<0.05), while its largest underperformance was only 0.01 (p>0.05). Qualitative review showed that MAGIC produced more visually consistent overlapping structures and fewer false predictions than Unimodal.
Conclusion: MAGIC coupled with multi-modality inputs outperformed Unimodal counterparts for the majority of CS. MAGIC remains generalized, enabling concurrent segmentation on different imaging modalities while handling overlap in a single, lightweight model.