Author: Hengjie Liu, Dan Ruan, Ke Sheng, DI Xu
Affiliation: Physics and Biology in Medicine, University of California, Los Angeles; Department of Radiation Oncology, University of California, San Francisco; Department of Radiation Oncology, University of California at San Francisco; Department of Radiation Oncology, University of California, Los Angeles
Purpose:
State-of-the-art deep learning-based deformable image registration often uses large, complex models directly adapted from computer vision tasks but achieves only comparable performance to conventional optimization-based methods. Recent studies have highlighted the importance of registration-specific designs over architectural improvements. In this work, we present DIRECT (Deep Image Registration with Efficient ComputaTion), a lightweight model that delivers competitive registration performance. DIRECT offers several key advantages, including faster training and inference, lower energy consumption, and improved compatibility with memory-constrained devices.
Methods:
The DIRECT network comprises feature extraction and deformation prediction modules, both optimized for computational efficiency through registration-specific designs. Its key components are dual-branch feature extraction, multiresolution pyramids, and low-to-high frequency refinement. For feature extraction, we explored two options: a deterministic discrete wavelet transform (DWT) and a lightweight, trainable CNN-based encoder. Deformation prediction used a multiresolution pyramid for coarse-to-fine deformation estimation, optionally combined with low-to-high frequency refinement (lo-hi) to enhance performance. We evaluated three variants of DIRECT: DWT, DWT&lo-hi, and CNN&lo-hi. Experiments were conducted on two brain MRI registration datasets, IXI and LUMIR, and results were benchmarked against state-of-the-art methods, including VoxelMorph, TransMorph, Dual-PR-Net, and VFA.
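To make the idea of a deterministic, parameter-free DWT feature pyramid concrete, the following is a minimal sketch and not the authors' implementation. It assumes a Haar wavelet (the abstract does not name the wavelet family), works on a 2D image stored as nested lists for brevity (the experiments use 3D brain MRI), and uses the hypothetical function names `haar_dwt2` and `haar_pyramid`. Each level yields one low-frequency approximation band and three high-frequency detail bands; how DIRECT actually consumes these bands is not specified here.

```python
def haar_dwt2(image):
    """One level of an orthonormal 2D Haar DWT (assumed wavelet, illustration only).

    `image` is a list of rows with even height and width.
    Returns (approx, (d_col, d_row, d_diag)), each half the input size per axis.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must both be even")

    approx, d_col, d_row, d_diag = [], [], [], []
    for r in range(0, rows, 2):
        a_line, c_line, r_line, g_line = [], [], [], []
        for c in range(0, cols, 2):
            # The four pixels of one 2x2 block.
            p, q = image[r][c], image[r][c + 1]
            s, t = image[r + 1][c], image[r + 1][c + 1]
            a_line.append((p + q + s + t) / 2.0)  # low-frequency approximation
            c_line.append((p - q + s - t) / 2.0)  # difference between columns
            r_line.append((p + q - s - t) / 2.0)  # difference between rows
            g_line.append((p - q - s + t) / 2.0)  # diagonal difference
        approx.append(a_line)
        d_col.append(c_line)
        d_row.append(r_line)
        d_diag.append(g_line)
    return approx, (d_col, d_row, d_diag)


def haar_pyramid(image, levels):
    """Build a multiresolution pyramid by repeatedly transforming the approximation.

    Returns a list of (approx, details) tuples ordered coarsest first,
    matching the order a coarse-to-fine estimator would visit them.
    """
    pyramid = []
    current = image
    for _ in range(levels):
        approx, details = haar_dwt2(current)
        pyramid.append((approx, details))
        current = approx
    return pyramid[::-1]


# Example: a two-level pyramid of a 4x4 image gives a 1x1 and a 2x2 level.
example = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
example_pyramid = haar_pyramid(example, levels=2)
```

Because the transform has no trainable weights, a feature extractor built this way adds nothing to the parameter count, which is consistent with the small size reported for the DWT variant. A 3D version would apply the same averaging and differencing along a third axis, producing one approximation band and seven detail bands per level.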
Results:
DIRECT demonstrated superior model efficiency, with parameters/Mult-Adds of 26K/25G for DIRECT (DWT) and 177K/126G for DIRECT (CNN&lo-hi), compared with 46.8M/658G for TransMorph and 492K/245G for Dual-PR-Net. On the IXI dataset, DIRECT (DWT) achieved a Dice score of 0.760 ± 0.123, outperforming TransMorph (0.753 ± 0.124) and Dual-PR-Net (0.756 ± 0.132), with improved deformation regularity. On the LUMIR dataset, DIRECT (CNN&lo-hi) delivered competitive accuracy (Dice: 0.7614 ± 0.0323 vs. 0.7594 ± 0.0319 for TransMorph and 0.7635 ± 0.0303 for Dual-PR-Net; TRE: 2.48 mm vs. 2.42 mm and 2.50 mm).
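The efficiency figures above can be turned into reduction factors with simple arithmetic. The sketch below only restates the reported numbers; it assumes decimal prefixes (K = 10^3, M = 10^6, G = 10^9), and the names `MODELS` and `reduction_factor` are hypothetical helpers introduced for illustration.

```python
K, M, G = 10**3, 10**6, 10**9  # assumed decimal prefixes

# (parameters, Mult-Adds) as reported in the Results paragraph.
MODELS = {
    "DIRECT (DWT)": (26 * K, 25 * G),
    "DIRECT (CNN&lo-hi)": (177 * K, 126 * G),
    "TransMorph": (46_800 * K, 658 * G),  # 46.8M parameters
    "Dual-PR-Net": (492 * K, 245 * G),
}


def reduction_factor(baseline, model):
    """Return (parameter ratio, Mult-Adds ratio) of `baseline` over `model`."""
    base_params, base_madds = MODELS[baseline]
    params, madds = MODELS[model]
    return base_params / params, base_madds / madds


if __name__ == "__main__":
    for model in ("DIRECT (DWT)", "DIRECT (CNN&lo-hi)"):
        for baseline in ("TransMorph", "Dual-PR-Net"):
            p_ratio, m_ratio = reduction_factor(baseline, model)
            print(f"{model} vs {baseline}: "
                  f"{p_ratio:.1f}x fewer parameters, {m_ratio:.1f}x fewer Mult-Adds")
```

Under these assumptions, DIRECT (DWT) has about 1800 times fewer parameters and about 26 times fewer Mult-Adds than TransMorph, while the gap to Dual-PR-Net is smaller (roughly 19 times and 10 times, respectively).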
Conclusion:
DIRECT substantially reduced computational complexity while achieving competitive deformable image registration performance. Its efficiency and low resource demand make it well suited for point-of-care deployment on memory-constrained devices and for energy-efficient applications.