Author: Weigang Hu
Affiliation: Fudan University Shanghai Cancer Center
Purpose: To introduce a VQ-VAE-based framework that addresses a limitation of conventional dose prediction methods: they rely on fixed deep learning models that produce deterministic dose outputs and therefore ignore the variability inherent in clinical radiotherapy planning.
Methods: The proposed framework uses a VQ-VAE network to generate 3D dose distributions, together with CNN-based style extractors that embed a KL divergence term to model stochastic variability. CT images and structure masks serve as inputs; the VQ encoder quantizes features against a codebook, and the decoder reconstructs the dose. The style extractors process CT, masks, and planning doses from multiple replans, and the KL divergence aligns the latent feature distributions so that they capture dose variance. Random Gaussian noise perturbations applied to the style features yield stochastic, diverse dose predictions. The dataset, comprising 70 nasopharyngeal cancer patients with five VMAT replans, was organized into binary channels for 2–4 PTVs and 8 OARs. The training loss combined perceptual, MSE, and KL divergence terms to encourage accurate, diverse, and clinically relevant predictions.
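The three mechanisms named in the Methods (codebook quantization, a KL divergence term, and Gaussian noise perturbation of style features) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, array shapes, the noise scale `sigma`, and the choice of a standard normal target for the KL term are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def quantize(features, codebook):
    """Map each feature vector to its nearest codebook entry (squared L2 distance).

    features: (N, D) array of encoder outputs (hypothetical shape).
    codebook: (K, D) array of learned code vectors (hypothetical shape).
    """
    diffs = features[:, None, :] - codebook[None, :, :]   # (N, K, D)
    dist2 = (diffs ** 2).sum(axis=-1)                      # (N, K)
    indices = dist2.argmin(axis=1)                         # nearest code per feature
    return codebook[indices], indices

def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)), summed over the last axis.

    Assumption: the abstract only says KL divergence aligns latent
    distributions; a standard normal target is used here as a stand-in.
    """
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)

def perturb_style(style, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma to style features."""
    return style + sigma * rng.standard_normal(style.shape)

# Tiny usage example with made-up numbers.
rng = np.random.default_rng(0)
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
features = np.array([[0.1, -0.1], [0.9, 1.2]])
quantized, idx = quantize(features, codebook)
style_a = perturb_style(np.zeros(4), sigma=0.1, rng=rng)
style_b = perturb_style(np.zeros(4), sigma=0.1, rng=rng)  # differs from style_a
```

Under these assumptions, drawing a fresh noise sample for each forward pass is what would make repeated predictions for the same patient differ.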
Results: The model achieved high accuracy, with a mean absolute error (MAE) of 0.106 Gy between prediction and ground truth, and reproduced clear ray path characteristics. Dose-volume histogram and dosimetric metric comparisons showed close agreement with ground truth for PTVs and clinically acceptable deviations for normal tissues. Repeated predictions for the same case varied stochastically, with an MAE of 0.103±0.012 Gy between prediction and ground truth, demonstrating the model's ability to generate diverse yet clinically consistent outputs.
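For readers unfamiliar with how figures of the form "MAE of 0.103±0.012 Gy" are typically obtained, the sketch below computes a voxel-wise MAE per prediction and then the mean and standard deviation across repeated predictions. It is an illustration under stated assumptions: the abstract does not say which voxels were included or which standard deviation convention was used, so the optional `mask` argument and the population standard deviation are hypothetical choices.

```python
import numpy as np

def mean_absolute_error(pred, truth, mask=None):
    """Voxel-wise mean absolute error in the dose unit of the inputs (e.g. Gy).

    mask: optional boolean array selecting the voxels to evaluate
    (hypothetical; the evaluation region is not stated in the abstract).
    """
    diff = np.abs(pred - truth)
    if mask is not None:
        diff = diff[mask]
    return float(diff.mean())

def repeated_mae(preds, truth, mask=None):
    """Mean and population standard deviation of the MAE over repeated predictions."""
    maes = np.array([mean_absolute_error(p, truth, mask) for p in preds])
    return float(maes.mean()), float(maes.std())

# Toy 2x2x2 dose grids with made-up values, not data from the study.
truth = np.zeros((2, 2, 2))
preds = [np.full((2, 2, 2), 0.1), np.full((2, 2, 2), 0.3)]
mae_mean, mae_std = repeated_mae(preds, truth)
```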
Conclusion: The VQ-VAE-based framework reliably predicts 3D dose distributions with high accuracy, sharp edge reconstruction, and inherent variability.