Improving Mammography Diagnosis Accuracy through Global Context and Local Lesion Integration

Authors: Minbin Chen, Xiaoyi Dai, Xiaoyu Duan, Chunhao Wang, Fan Xia, Zhenyu Yang, Fang-Fang Yin, Chulong Zhang, Rihui Zhang

Affiliations: Duke University, Duke Kunshan University, Medical Physics Graduate Program, Duke Kunshan University, The First People's Hospital of Kunshan

Abstract:

Purpose: Deep learning (DL)-based mammography diagnosis presents unique challenges, as accurate interpretation requires both global breast condition analysis and local lesion structural information. Existing DL models often focus on either global or local aspects, leading to suboptimal diagnostic performance. This study aimed to develop a novel global-local multi-view mammography diagnostic model that integrates the Swin-transformer’s long-range modeling capabilities with wavelet convolutional neural networks (CNNs) to effectively capture and preserve both global structures and local detailed features in mammographic images.

Methods: We designed a global-local multi-view mammography diagnostic model by combining the Swin-transformer and wavelet CNNs. The Swin-transformer provided a global perspective by capturing long-range dependencies and hierarchical features across the entire mammogram. To address the need for local lesion analysis, a weakly supervised learning scheme was employed to identify regions of interest (ROIs). Class Activation Maps were used to localize suspicious regions, and wavelet CNNs were applied to extract detailed features from these ROIs. An attention mechanism was subsequently used to fuse global and local features, producing the final diagnostic prediction.
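The local branch and the fusion step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`class_activation_map`, `roi_from_cam`, `attention_fuse`), the relative threshold of 0.5, and the dot-product scoring against a `query` vector are all assumptions made for illustration, and plain nested lists stand in for the feature tensors a real model would produce.

```python
import math


def class_activation_map(feature_maps, class_weights):
    """Weighted sum of K feature maps (each H x W) for one class."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for r in range(height):
            for c in range(width):
                cam[r][c] += weight * fmap[r][c]
    return cam


def roi_from_cam(cam, rel_threshold=0.5):
    """Inclusive box (top, left, bottom, right) around strong activations.

    A cell counts as "strong" when its value is at least
    rel_threshold * max(cam). The threshold is an assumed value.
    """
    peak = max(max(row) for row in cam)
    if peak <= 0:
        return None  # nothing activated, so no ROI is proposed
    cutoff = rel_threshold * peak
    hits = [(r, c)
            for r, row in enumerate(cam)
            for c, value in enumerate(row)
            if value >= cutoff]
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return (min(rows), min(cols), max(rows), max(cols))


def attention_fuse(global_feat, local_feat, query):
    """Softmax-weighted blend of a global and a local feature vector.

    Each vector is scored by its dot product with `query` (a hypothetical
    learned parameter); the two scores are normalized with a softmax.
    """
    scores = [sum(q * x for q, x in zip(query, feat))
              for feat in (global_feat, local_feat)]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    fused = [weights[0] * g + weights[1] * l
             for g, l in zip(global_feat, local_feat)]
    return fused, weights
```

In a full pipeline, the box returned by `roi_from_cam` would be rescaled to image coordinates and used to crop the patch passed to the local feature extractor; the abstract does not specify how that rescaling or cropping is done.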
The model was evaluated on two large publicly available mammography datasets: VinDr-Mammo and CBIS-DDSM. Each dataset was divided into training, validation, and testing sets at a ratio of 7:1:2. The proposed method was benchmarked against baseline ResNet-18 and Swin-transformer models using the area under the ROC curve (AUC), accuracy (ACC), and F1-score.
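For readers who want to reproduce this kind of evaluation protocol, the following is a minimal sketch of a 7:1:2 split and the three reported metrics. Several points are assumptions rather than details from the abstract: the split here is a seeded random shuffle over sample indices (the abstract does not say whether splitting was done per image or per patient), F1 is computed for a binary positive class, and the helper names are hypothetical.

```python
import random


def split_indices(n_samples, ratios=(7, 1, 2), seed=0):
    """Shuffle indices and cut them into train/validation/test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    total = sum(ratios)
    n_train = n_samples * ratios[0] // total
    n_val = n_samples * ratios[1] // total
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Binary F1 with label 1 as the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc(y_true, scores):
    """Probability that a positive outscores a negative (ties count half).

    Assumes y_true contains at least one positive and one negative label.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise AUC above is quadratic in the number of samples, which is fine for illustration; a rank-based formulation or a library routine would be preferable on full-size test sets.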

Results: The proposed model achieved AUC = 0.815/0.803, ACC = 0.933/0.703, and F1-score = 0.921/0.704 on the VinDr-Mammo and CBIS-DDSM datasets, respectively. It outperformed both the ResNet-18 (VinDr-Mammo: AUC/ACC/F1 = 0.727/0.783/0.619; CBIS-DDSM: AUC/ACC/F1 = 0.719/0.591/0.558) and Swin-transformer (VinDr-Mammo: AUC/ACC/F1 = 0.731/0.651/0.594; CBIS-DDSM: AUC/ACC/F1 = 0.724/0.601/0.599) models on both datasets.

Conclusion: The proposed global-local multi-view mammography diagnostic model provides a robust and effective approach for improving breast cancer diagnostic accuracy by integrating global and local feature analysis. The results demonstrate its potential to enhance clinical decision-making in mammography.
