Author: Yunfei Dong, Dongyang Guo, Jiongli Pan, Tao Peng, Caiyin Tang, Zhenyu Yang, Fang-Fang Yin, Lei Zhang, Tianyi Zhang, Yaogong Zhang
Affiliation: Duke Kunshan University, Department of Radiology, Taizhou People's Hospital Affiliated to Nanjing Medical University, School of Future Science and Engineering, Soochow University, Medical Physics Graduate Program, Duke Kunshan University
Purpose: This study aims to improve the accuracy of CT-based diagnosis of thyroid cancer by developing a hybrid model that integrates Convolutional Neural Networks (CNNs) with Long Short-Term Memory (LSTM) networks using multi-mechanism fusion strategies. The key innovation is to leverage both the spatial features within individual CT images and the sequential relationships across images to enhance diagnostic accuracy.
Methods: A cohort of eighty-five patients with thyroid tumors (58 malignant and 27 benign) who underwent computed tomography (CT) scans was analyzed. The proposed approach transformed 3D CT images into 2.5D data to capture spatial relationships between adjacent images along the z-direction. A baseline CNN model, comprising five convolutional layers and two fully connected layers, was developed for tumor classification. CNN-LSTM hybrid models were then constructed, each employing a different mechanism (LastOutput, Attention, MaxPooling, or AveragePooling) in the LSTM module. A final fusion model integrating multiple mechanisms was implemented to consolidate the strengths of the individual models.
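The sketch below illustrates, in PyTorch, how such a CNN-LSTM hybrid could be organized: a per-slice CNN with five convolutional and two fully connected layers, an LSTM over the 2.5D slice sequence, four selectable aggregation mechanisms, and a fusion model combining them. All layer widths, the attention formulation, and the fusion-by-averaging rule are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of the CNN-LSTM hybrid described above (assumed hyperparameters).
import torch
import torch.nn as nn


class SliceCNN(nn.Module):
    """Baseline CNN: five convolutional layers + two fully connected layers."""

    def __init__(self, feat_dim: int = 128):
        super().__init__()
        chans = [1, 16, 32, 64, 128, 128]          # assumed channel widths
        layers = []
        for cin, cout in zip(chans[:-1], chans[1:]):
            layers += [nn.Conv2d(cin, cout, 3, padding=1),
                       nn.BatchNorm2d(cout), nn.ReLU(), nn.MaxPool2d(2)]
        self.conv = nn.Sequential(*layers)
        self.fc = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                nn.Linear(128, feat_dim), nn.ReLU())

    def forward(self, x):                           # x: (B, 1, H, W)
        return self.fc(self.conv(x))                # (B, feat_dim)


class CNNLSTM(nn.Module):
    """CNN features per slice, LSTM along z, one of four aggregation mechanisms."""

    def __init__(self, mechanism: str = "last", feat_dim: int = 128,
                 hidden: int = 64, n_classes: int = 2):
        super().__init__()
        assert mechanism in {"last", "attention", "max", "avg"}
        self.mechanism = mechanism
        self.cnn = SliceCNN(feat_dim)
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)            # used only for "attention"
        self.cls = nn.Linear(hidden, n_classes)

    def forward(self, x):                           # x: (B, T, 1, H, W) 2.5D stack
        b, t = x.shape[:2]
        feats = self.cnn(x.flatten(0, 1)).view(b, t, -1)
        seq, _ = self.lstm(feats)                   # (B, T, hidden)
        if self.mechanism == "last":                # LastOutput: final hidden state
            z = seq[:, -1]
        elif self.mechanism == "attention":         # learned weights over slices
            w = torch.softmax(self.attn(seq), dim=1)
            z = (w * seq).sum(dim=1)
        elif self.mechanism == "max":               # MaxPooling over the sequence
            z = seq.max(dim=1).values
        else:                                       # AveragePooling over the sequence
            z = seq.mean(dim=1)
        return self.cls(z)


class FusionModel(nn.Module):
    """Multi-mechanism fusion, sketched here as averaging the four models' logits."""

    def __init__(self):
        super().__init__()
        self.models = nn.ModuleList(
            CNNLSTM(m) for m in ("last", "attention", "max", "avg"))

    def forward(self, x):
        return torch.stack([m(x) for m in self.models]).mean(dim=0)
```

Averaging logits is only one plausible fusion rule; weighted voting or a learned fusion layer over the concatenated aggregated features would fit the same overall design.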
Results: The standalone CNN model achieved an accuracy of 82.71%. Incorporating LSTM networks improved performance under every mechanism: the CNN-LSTM with LastOutput reached an accuracy of 92.11%, the Attention and MaxPooling mechanisms each attained 91.49%, and the AveragePooling mechanism yielded 89.36%. The multi-mechanism fusion model achieved the highest accuracy of 93.62%.
Conclusion: The integration of CNN and LSTM networks through multi-mechanism fusion strategies substantially enhanced the accuracy of CT-based thyroid cancer diagnosis on a small clinical dataset. This fusion approach effectively captures both intrinsic image features and spatial relationships between images, outperforming traditional single-network models. The findings highlight the potential of multi-mechanism fusion strategies for improving the diagnostic accuracy of deep learning-based medical imaging applications.