Performance Evaluation of CT-Based Lung Tumor Classification Deep Learning Algorithms Under Centralized and Federated Learning Frameworks

Authors: Yifei Hao, Chengliang Jin, Wenxuan Li, Bing Luo, Tao Peng, Yulu Wu, Fang-Fang Yin, Yue Yuan, Lei Zhang, Ruojun Zhou

Affiliations: School of Future Science and Engineering, Soochow University; Electrical and Computer Engineering Graduate Program, Duke Kunshan University; Medical Physics Graduate Program, Duke Kunshan University

Abstract:

Purpose: Federated learning (FL) is a privacy-preserving training technique that has recently been applied in the medical field. This study evaluates the performance of several deep learning networks under centralized and FL frameworks for CT-based lung tumor type classification.

Methods: The public dorsar/lung-cancer dataset, comprising 708 lung CT images (223 adenocarcinoma, 163 large cell carcinoma, 122 normal, and 200 squamous cell carcinoma), was used. Four deep learning networks (ConvNeXT, Swin Transformer, GoogLeNet, and ResNet) were studied for multi-class tumor classification. Both centralized and FL frameworks were evaluated with a 7:1:2 split of the data for training, validation, and testing. FL was performed with differential privacy: locally trained parameters were clipped and perturbed with random Gaussian noise before being sent to the server. The server aggregated the collected parameters and distributed them back to the clients for the next iteration. Accuracy, sensitivity, specificity, and AUC were compared among the four networks and between the centralized and FL frameworks.
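The clip, perturb, aggregate, and redistribute steps described in the Methods can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the clipping norm, the noise scale, the aggregation rule, or whether clipping is applied to the parameters themselves or to their change from the global model. The sketch assumes clipping of the update (local minus global parameters) and an unweighted mean on the server; all function and parameter names (`clip_update`, `privatize`, `federated_round`, `clip_norm`, `noise_std`) are hypothetical.

```python
import numpy as np


def clip_update(update, clip_norm):
    """Rescale an update so that its L2 norm does not exceed clip_norm."""
    norm = np.linalg.norm(update)
    scale = min(1.0, clip_norm / (norm + 1e-12))
    return update * scale


def privatize(update, clip_norm, noise_std, rng):
    """Client side: clip the update, then add zero-mean Gaussian noise."""
    clipped = clip_update(update, clip_norm)
    return clipped + rng.normal(0.0, noise_std, size=clipped.shape)


def federated_round(global_params, client_params, clip_norm, noise_std, rng):
    """One round: clients send noisy clipped updates, the server averages them.

    Assumption: unweighted mean aggregation (the abstract does not specify).
    The returned vector is what the server would redistribute to the clients.
    """
    noisy_updates = [
        privatize(local - global_params, clip_norm, noise_std, rng)
        for local in client_params
    ]
    return global_params + np.mean(np.stack(noisy_updates), axis=0)
```

In this sketch a smaller `clip_norm` or a larger `noise_std` strengthens the privacy protection at the cost of a noisier aggregate, which is one plausible source of the extra variability seen under FL.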

Results: Under the centralized framework, the four networks achieved accuracies ranging from 0.9718 to 0.9789, with the highest sensitivity (1.0) for adenocarcinoma and the highest specificity (1.0) for large cell carcinoma and normal cases. All AUC values were above 0.99. Under the FL framework, the accuracies were 0.9653 (ConvNeXT), 0.9583 (Swin Transformer), 0.9718 (GoogLeNet), and 0.9390 (ResNet), which were slightly lower and more variable than those under the centralized framework. AUC values were mostly above 0.99, with the exception of Swin Transformer for the three cancer types.
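Per-class sensitivity and specificity in a multi-class setting are usually obtained by treating each class one-vs-rest. The abstract does not state how the metrics were computed, so the following is a sketch under that assumption; the function name `per_class_metrics` is hypothetical, and the confusion matrix is assumed to have true classes in rows and predicted classes in columns.

```python
import numpy as np


def per_class_metrics(confusion):
    """Return overall accuracy plus one-vs-rest sensitivity and specificity.

    confusion[i, j] counts samples of true class i predicted as class j.
    """
    confusion = np.asarray(confusion, dtype=float)
    total = confusion.sum()
    tp = np.diag(confusion)              # correctly predicted, per class
    fn = confusion.sum(axis=1) - tp      # class samples predicted as another class
    fp = confusion.sum(axis=0) - tp      # other samples predicted as this class
    tn = total - tp - fn - fp            # everything else
    accuracy = tp.sum() / total
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, sensitivity, specificity
```

A sensitivity of 1.0 for a class then means that no image of that class was missed, and a specificity of 1.0 means that no image of another class was assigned to it.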

Conclusion: Four deep learning networks were evaluated on CT-based multi-class lung tumor classification under both centralized and federated learning frameworks. Federated learning achieved competitive classification performance while preserving patient data privacy, although performance was more variable for certain models.
