Author: Wenfeng He, Tian Liu, Pretesh Patel, Richard L.J. Qiu, Keyur Shah, Tonghe Wang, Xiaofeng Yang, Chulong Zhang
Affiliation: Icahn School of Medicine at Mount Sinai; Emory University, Medical Physics Graduate Program; Duke Kunshan University; Memorial Sloan Kettering Cancer Center; Department of Radiation Oncology and Winship Cancer Institute, Emory University
Purpose: This study introduces a tracking-free approach to reconstruct 3D ultrasound (US) volumes from 2D freehand US scans. By eliminating the reliance on external tracking systems, this method aims to enhance 3D reconstruction accuracy, thereby improving clinical diagnosis, surgical planning, and intraoperative guidance.
Methods: A novel hybrid transformer–convolutional neural network (CNN) architecture was developed for 3D US volume reconstruction. This hybrid architecture leverages transformers' ability to capture long-range dependencies and global context in 2D US frames, alongside CNNs' strength in extracting local features and preserving fine details. The transformer component effectively interprets complex anatomical structures, while the CNN ensures precise texture and boundary representation. This integration enhances the accuracy and robustness of volume reconstruction from freehand US scans without tracking. The method was evaluated using 228 forearm US scans acquired from 19 volunteers. Data from 13 volunteers were randomly selected for training, 3 for validation, and 3 for testing. Performance was quantitatively evaluated using the Dice Similarity Coefficient (DSC) and drift rate metrics.
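For readers reproducing the evaluation, the sketch below shows one common way to compute the two reported metrics. The abstract does not give exact formulas, so the drift-rate definition used here (positional error of the final reconstructed frame normalized by total scan length) is an assumption, and all function and variable names are illustrative.

```python
import numpy as np

def dice_similarity(pred_mask: np.ndarray, gt_mask: np.ndarray) -> float:
    """Dice Similarity Coefficient between two binary volumes."""
    pred = pred_mask.astype(bool)
    gt = gt_mask.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    denom = pred.sum() + gt.sum()
    return 2.0 * intersection / denom if denom > 0 else 1.0

def drift_rate(pred_final_center: np.ndarray,
               gt_final_center: np.ndarray,
               scan_length_mm: float) -> float:
    """Drift rate (%): distance between the predicted and ground-truth
    positions of the final frame, normalized by total scan length.
    (Assumed definition; not stated explicitly in the abstract.)"""
    drift = np.linalg.norm(pred_final_center - gt_final_center)
    return 100.0 * drift / scan_length_mm
```

In this convention, ground-truth frame positions would come from an external tracker used only for evaluation, while the reconstruction itself remains tracking-free.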
Results: On the test dataset, the proposed hybrid network achieved a mean DSC of 0.63 ± 0.20 and a drift rate of 15.63 ± 10.19%. Compared with a CNN-based approach, the proposed method improved the DSC by 0.07 and reduced the drift rate by 7.95%. Relative to a recurrent neural network (RNN)-based approach, the DSC increased by 0.02 and the drift rate decreased by 4.82%.
Conclusion: This study presents a new tracking-free approach for reconstructing 3D US volumes from 2D freehand scans using a hybrid transformer–CNN network. The proposed method demonstrated higher accuracy and robustness than CNN- and RNN-based baselines. By removing the need for external tracking devices, our method offers a practical, cost-effective solution for 3D US imaging across a wide range of clinical applications.