Authors: Hajar Moradmand, Lei Ren
Affiliation: University of Maryland School of Medicine, University of Maryland
Purpose:
The Sharp-van der Heijde (SvH) score is essential for assessing joint damage in rheumatoid arthritis (RA) from radiographic images. However, manual scoring is time-intensive and prone to variability. This study presents a novel multistage deep learning framework for automated Overall Sharp Score (OSS) prediction from hand X-rays, utilizing a Vision Transformer (ViT) to improve accuracy and generalizability.
Methods:
The framework consists of four stages:
Image preprocessing: All images are resized, normalized, and re-oriented (90-degree orientation).
Hand segmentation: A U-Net with an EfficientNet-B0 backbone segments the hand regions.
Joint identification: YOLOv7 (You Only Look Once) localizes key joints (e.g., MCP, PIP, and wrist).
OSS prediction: A ViT predicts the OSS from the localized joint images using self-attention.
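As an illustration of the preprocessing stage listed above, the following is a minimal sketch and not the authors' implementation. The helper name `preprocess`, the 224 x 224 target size, the nearest-neighbour resampling, the min-max normalization, and the idea that re-orientation happens in 90-degree turns are all assumptions made for the example.

```python
import numpy as np

def preprocess(image, size=(224, 224), quarter_turns=0):
    """Re-orient, resize, and normalize a grayscale radiograph (hypothetical helper)."""
    img = np.asarray(image, dtype=np.float32)
    # Re-orient in 90-degree steps; the number of turns needed is an assumption.
    img = np.rot90(img, k=quarter_turns)
    # Nearest-neighbour resize via index arithmetic (stand-in for a real resampler).
    rows = np.arange(size[0]) * img.shape[0] // size[0]
    cols = np.arange(size[1]) * img.shape[1] // size[1]
    img = img[rows][:, cols]
    # Min-max normalization to [0, 1]; constant images map to all zeros.
    lo, hi = img.min(), img.max()
    if hi > lo:
        return (img - lo) / (hi - lo)
    return np.zeros_like(img)
```

In practice an image library would handle the resampling; plain index arithmetic is used here only to keep the sketch dependency-free.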
A dataset of 970 patients covering all RA stages was used. Padding and masking were applied to align joint images and to handle variability in joint visibility. Stratified 3-fold cross-validation was conducted, with 679 patients allocated for training and 291 reserved for external validation. Evaluation metrics for OSS prediction included mean absolute error (MAE), root mean squared error (RMSE), Huber loss, and the intraclass correlation coefficient (ICC).
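One way padding and masking can cope with missing joints is sketched below. This is an assumption-laden illustration, not the study's code: the slot-indexed dictionary input, the 64 x 64 crop size, and the helper names `pad_and_mask` and `masked_softmax` are hypothetical. Missing joints are left as zero-filled slots, and a boolean mask keeps them from receiving any attention weight.

```python
import numpy as np

def pad_and_mask(joint_crops, num_slots, crop_shape=(64, 64)):
    """Stack joint crops into fixed slots and return a validity mask.

    joint_crops: dict mapping slot index -> 2-D array (hypothetical format);
    slots absent from the dict are treated as missing joints.
    """
    batch = np.zeros((num_slots, *crop_shape), dtype=np.float32)
    mask = np.zeros(num_slots, dtype=bool)
    for slot, crop in joint_crops.items():
        batch[slot] = crop
        mask[slot] = True
    return batch, mask

def masked_softmax(scores, mask):
    """Softmax over slots that gives zero weight to padded (missing) joints.

    Assumes at least one slot is valid.
    """
    scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max()
    weights = np.exp(scores)
    return weights / weights.sum()

# Usage: four joint slots, two of which were not detected.
crops = {0: np.ones((64, 64)), 2: np.full((64, 64), 2.0)}
batch, mask = pad_and_mask(crops, num_slots=4)
weights = masked_softmax(np.zeros(4), mask)
```

With equal scores, the two visible joints share the attention weight and the padded slots receive none.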
Results:
The joint identification model achieved 99% accuracy. The ViT model performed well in OSS prediction, particularly for patients with scores below 50, with a Huber loss of 4.9, RMSE of 9.73, and MAE of 5.35. It showed high agreement with expert scores (ICC = 0.702, P < 0.001). The approach also handles missing joints and image variability arising from disease progression and acquisition differences.
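For readers unfamiliar with the reported metrics, the sketch below shows how they are commonly computed. It is illustrative only: the Huber threshold `delta=1.0` and the choice of ICC(2,1) (two-way random effects, absolute agreement, single rater) are assumptions, since the abstract specifies neither.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(diff)))

def rmse(y_true, y_pred):
    """Root mean squared error."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def huber(y_true, y_pred, delta=1.0):
    """Huber loss: quadratic for small errors, linear beyond delta (assumed value)."""
    err = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    quadratic = 0.5 * err ** 2
    linear = delta * (err - 0.5 * delta)
    return float(np.mean(np.where(err <= delta, quadratic, linear)))

def icc2_1(ratings):
    """ICC(2,1) for an (n_subjects, k_raters) array; the ICC form is an assumption."""
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)   # between subjects
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)   # between raters
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols  # residual
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return float((msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n))

# Usage with made-up scores (not data from the study).
expert = np.array([0.0, 10.0, 20.0, 30.0])
model = np.array([1.0, 10.0, 18.0, 33.0])
agreement = icc2_1(np.column_stack([expert, model]))
```

Here the two columns passed to `icc2_1` play the role of two raters: the expert score and the model prediction for each patient.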
Conclusion:
To our knowledge, this study is the first to employ a Vision Transformer for OSS prediction in RA, providing an automated, efficient, and robust alternative to manual scoring. The system could support early detection, monitoring, and treatment planning for RA, particularly in resource-limited settings where access to expert radiologists is scarce.