Combining Patch-Based CNN Models with Hierarchical Shapley Explanations for Breast Cancer Diagnosis

Authors: Xuelian Chen, John Ginn, Zhuhong Li, Kaizhong Shi, Chunhao Wang, Jianliang Wang, Chuan Wu, Zhenyu Yang, Fang-Fang Yin, Jingtong Zhao

Affiliation: The First People's Hospital of Kunshan, Duke University, Medical Physics Graduate Program, Duke Kunshan University, Duke Kunshan University, Department of Radiation Oncology, Duke Kunshan University

Abstract:

Purpose: Developing deep learning-based models for accurate automated breast cancer diagnosis from mammography presents significant challenges due to the small size and subtle nature of breast lesions relative to the large dimensions of mammographic images. Furthermore, explaining the contribution of key image regions to classification decisions remains a critical difficulty. This study aimed to develop a CNN-based automated breast cancer diagnosis model and integrate a novel Hierarchical Shapley (h-Shap) method to effectively explain and visualize the critical image regions influencing classification outcomes.
Methods: The publicly available CBIS-DDSM mammography dataset, comprising 1,131 patients with 1,355 abnormalities, was used. Each original image was resized to 1500×1000 pixels and uniformly divided into smaller patches. The patches were fed to an EfficientNet-B0 CNN, which classified whether each patch contained an abnormality. The dataset was split into training and testing sets at an 8:2 ratio. The h-Shap method was used to calculate the contribution of individual image regions to each classification decision, enabling a hierarchical decomposition of feature importance. Heatmaps were generated to visually compare the regions driving the model’s predictions with clinically relevant features.
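The patching and hierarchical attribution steps described in Methods can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The 250×250 patch size, the all-zero baseline, the zero threshold, the reading of 1500×1000 as height×width, and the names `split_into_patches`, `shapley_of_quadrants`, `h_shap`, and `toy_score` are all assumptions made for illustration, and a toy scoring function stands in for the trained EfficientNet-B0 classifier. The attribution part follows the general idea of hierarchical Shapley explanations: treat the four quadrants of a region as players, compute their exact Shapley values by masking, and recurse only into quadrants whose contribution exceeds a threshold.

```python
import itertools
import math

import numpy as np


def split_into_patches(image, patch_h, patch_w):
    """Cut a 2-D image into non-overlapping patches of shape (patch_h, patch_w)."""
    rows, cols = image.shape[0] // patch_h, image.shape[1] // patch_w
    trimmed = image[: rows * patch_h, : cols * patch_w]  # drop any ragged border
    # Result shape: (rows, cols, patch_h, patch_w)
    return trimmed.reshape(rows, patch_h, cols, patch_w).swapaxes(1, 2)


def shapley_of_quadrants(image, baseline, quads, score_fn):
    """Exact Shapley values for a game whose players are the given quadrants."""
    n = len(quads)

    def value(subset):
        # Show only the chosen quadrants; everything else is the baseline.
        masked = baseline.copy()
        for k in subset:
            r0, r1, c0, c1 = quads[k]
            masked[r0:r1, c0:c1] = image[r0:r1, c0:c1]
        return score_fn(masked)

    phi = np.zeros(n)
    for i in range(n):
        others = [k for k in range(n) if k != i]
        for size in range(n):
            weight = math.factorial(size) * math.factorial(n - size - 1) / math.factorial(n)
            for subset in itertools.combinations(others, size):
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return phi


def h_shap(image, baseline, score_fn, region, threshold, min_size, heatmap):
    """Recursively refine quadrants whose Shapley value exceeds `threshold`."""
    r0, r1, c0, c1 = region
    if r1 - r0 <= min_size or c1 - c0 <= min_size:
        heatmap[r0:r1, c0:c1] = 1.0  # leaf reached: mark the region as relevant
        return
    rm, cm = (r0 + r1) // 2, (c0 + c1) // 2
    quads = [(r0, rm, c0, cm), (r0, rm, cm, c1), (rm, r1, c0, cm), (rm, r1, cm, c1)]
    phi = shapley_of_quadrants(image, baseline, quads, score_fn)
    for quad, contribution in zip(quads, phi):
        if contribution > threshold:
            h_shap(image, baseline, score_fn, quad, threshold, min_size, heatmap)


# Patching: a placeholder 1500x1000 array with an assumed 250x250 patch size.
mammogram = np.arange(1500 * 1000, dtype=np.float32).reshape(1500, 1000)
patches = split_into_patches(mammogram, 250, 250)  # shape (6, 4, 250, 250)

# Attribution: a toy 8x8 "patch" with one bright pixel and a toy classifier
# (hypothetical stand-in for the CNN) that fires if any pixel exceeds 0.5.
toy_patch = np.zeros((8, 8))
toy_patch[5, 2] = 1.0
toy_score = lambda x: float(x.max() > 0.5)
heatmap = np.zeros_like(toy_patch)
h_shap(toy_patch, np.zeros_like(toy_patch), toy_score, (0, 8, 0, 8), 0.0, 1, heatmap)
```

In this toy case the recursion narrows down to the single bright pixel, so `heatmap` is nonzero only at position (5, 2). With a real classifier, the baseline, threshold, and minimum region size would need to be chosen for the data at hand.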
Results: The proposed model achieved an overall classification accuracy of 83.43% on the test set. The confusion matrix showed a true positive rate (TPR) of 29.62% and a true negative rate (TNR) of 90.16%. The low TPR indicates that the model struggled to identify positive samples, reflecting the difficulty of detecting subtle abnormalities. Notably, in 35.20% of correctly classified positive samples, the tumor region was accurately identified and highlighted in the h-Shap visualizations, underscoring the method’s effectiveness in enhancing model explainability.
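The relationship between the three reported metrics can be made explicit with a short sketch. Accuracy is a weighted average of TPR and TNR, with weights given by the fraction of positive and negative samples, so a high accuracy can coexist with a low TPR when negatives dominate. The function names below are hypothetical, and the calculation assumes all three figures were computed on the same test samples; the implied positive fraction is an arithmetic consequence of that assumption, not a number reported in the abstract.

```python
def rates_from_counts(tp, fn, tn, fp):
    """Return (accuracy, TPR, TNR) from confusion-matrix counts."""
    tpr = tp / (tp + fn)
    tnr = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, tpr, tnr


def implied_positive_fraction(accuracy, tpr, tnr):
    """Solve accuracy = p * TPR + (1 - p) * TNR for the positive fraction p."""
    return (tnr - accuracy) / (tnr - tpr)


# Values reported in the abstract, expressed as fractions.
p = implied_positive_fraction(accuracy=0.8343, tpr=0.2962, tnr=0.9016)
print(f"implied positive fraction: {p:.3f}")
```

Under that assumption, roughly 11% of the test samples would be positive, which is consistent with overall accuracy being dominated by the TNR.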
Conclusion: This study presents an integrated approach that combines the EfficientNet-B0 CNN model for mammography-based breast cancer diagnosis with the h-Shap method for improved explainability.
