Author: Li-Sheng Geng, David Huang, Haoze Li, Xi Liu, Meng Wang, Tianyu Xiong, Ruijie Yang, Weifang Zhang, Meixin Zhao
Affiliation: School of Physics, Beihang University; Department of Health Technology and Informatics, The Hong Kong Polytechnic University; Department of Radiation Oncology, Peking University Third Hospital; Department of Nuclear Medicine, Peking University Third Hospital; Medical Physics Graduate Program, Duke Kunshan University
Purpose: This study aimed to develop a deep learning-based framework for simultaneously generating lung perfusion and ventilation images from three-dimensional computed tomography (3D CT) images.
Methods: A total of 98 cases with single-photon emission CT (SPECT) perfusion images (PI) acquired with 99mTc-labeled macroaggregated albumin, ventilation images (VI) acquired with 99mTc-Technegas, and 3D CT images were collected. The 3D CT and SPECT images were registered and cropped to include only the lungs. A 3D dual-decoder residual attention network (DDRAN) was constructed to generate DL-based PI and DL-based VI simultaneously from 3D CT images. For comparison, we also employed a conventional single-decoder residual attention network (RAN) to generate PI and VI individually. The structural similarity index (SSIM) and Spearman's rank correlation coefficient (Rs) were used to assess voxel-wise agreement, while the Dice similarity coefficient (DSC) was used to evaluate function-wise concordance. The Wilcoxon signed-rank test was used to statistically compare the images synthesized by DDRAN with those synthesized by RAN.
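The voxel-wise and function-wise metrics above can be sketched as follows. This is a minimal NumPy illustration of Spearman's Rs and the DSC, not the study's exact evaluation code; the array names, volume sizes, and the 50%-of-maximum functional threshold are illustrative assumptions.

```python
import numpy as np

def spearman_rs(a, b):
    """Spearman's rank correlation: Pearson correlation of the ranks
    (simple argsort-based ranking; ties are not averaged)."""
    ra = np.argsort(np.argsort(a.ravel())).astype(float)
    rb = np.argsort(np.argsort(b.ravel())).astype(float)
    ra -= ra.mean()
    rb -= rb.mean()
    return float((ra * rb).sum() / np.sqrt((ra**2).sum() * (rb**2).sum()))

def dice(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks."""
    inter = np.logical_and(mask_a, mask_b).sum()
    return float(2.0 * inter / (mask_a.sum() + mask_b.sum()))

# Toy volumes standing in for a reference SPECT image and a synthesized one.
rng = np.random.default_rng(0)
ref = rng.random((8, 8, 8))
syn = ref + 0.05 * rng.standard_normal((8, 8, 8))

# Voxel-wise agreement over the whole (toy) lung volume.
rs = spearman_rs(ref, syn)

# Function-wise agreement: threshold each volume into a "high-function"
# region -- here, voxels above 50% of that volume's maximum (an assumption).
dsc = dice(ref > 0.5 * ref.max(), syn > 0.5 * syn.max())
```

In practice, SSIM would typically be computed with an existing implementation (e.g. scikit-image's `structural_similarity`), and the Wilcoxon signed-rank test applied to the per-case metric values across the test set.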
Results: The average SSIM values of the DDRAN/RAN models were 0.871/0.866 (p<0.05) for PI and 0.830/0.825 (p<0.05) for VI, and the corresponding Rs values were 0.836/0.819 and 0.732/0.731. The DDRAN/RAN models achieved average DSC values of 0.795/0.797 for PI and 0.708/0.718 for VI in low functional regions, and 0.857/0.849 for PI and 0.794/0.793 for VI in high functional regions.
Conclusion: We developed a dual-decoder residual attention network that simultaneously synthesizes lung perfusion and ventilation images from 3D CT images. The preliminary results demonstrated moderate-to-high voxel-wise and function-wise concordance, and the proposed model achieved accuracy comparable to that of single-decoder models. The synthesized perfusion and ventilation images could potentially be used for diagnosing lung diseases and for guiding the design of functional lung avoidance radiotherapy treatment plans.