Author: David L. Barbee, David Byun, Matt Long, Jose R. Teruel Antolin, Michael J Zelefsky
Affiliation: NYU Langone Health
Purpose:
Online adaptive MR-Linac therapy requires contour adaptation, often adding 20 minutes to treatment time and reducing machine throughput. This study introduces a fully automated MR contour prediction model using nnUNet and in-house data.
Methods:
A U-Net segmentation model was developed using 650 manually contoured T2 axial 2D prostate images (125 patients) from online adaptive and MR simulation scans. Structures included prostate (+seminal vesicles), urethra, penile bulb, femurs, bones, bladder, and rectum. Data handling utilized PyDicer, and training employed nnUNetv2 with 5-fold cross-validation on A100 GPUs. Inference was conducted on an RTX2000 workstation integrated with an automated DICOM pipeline to process images and send predictions to MIM. The model was tested on 27 unseen online adaptive patient scans, varying in slice number (67–125), thickness (2.0–3.0 mm), site (prostate-only, +seminal vesicles, lymph nodes, bed, nodes-only), voxel dimensions, and scan technique (2D vs. 3D compressed sense). Predicted structures were compared with clinical contours using overlap, distance, surface, and volume metrics.
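The overlap metric reported in Results is the Dice score. The following is a minimal sketch of how such a score can be computed for one structure, assuming the predicted and clinical contours have already been rasterized to boolean masks on the same voxel grid. The function name `dice_score` and the toy masks are illustrative assumptions, not the study's evaluation code.

```python
import numpy as np


def dice_score(pred: np.ndarray, ref: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    if pred.shape != ref.shape:
        raise ValueError("masks must share the same voxel grid")
    denom = pred.sum() + ref.sum()
    if denom == 0:
        # Structure absent from both masks: Dice is undefined.
        return float("nan")
    return float(2.0 * np.logical_and(pred, ref).sum() / denom)


# Toy example (hypothetical data): two 32-voxel boxes offset by one row.
ref = np.zeros((4, 8, 8), dtype=bool)
ref[1:3, 2:6, 2:6] = True
pred = np.zeros_like(ref)
pred[1:3, 3:7, 2:6] = True

toy_dice = dice_score(pred, ref)  # overlap is 24 voxels, so 2*24 / (32+32)
print(f"Dice = {toy_dice:.2f}")
```

Distance and surface metrics would need the voxel spacing in addition to the masks, which is why they are left out of this sketch.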
Results:
Training required ~8 hours per fold on A100 GPUs, while inference averaged 3 min 40 s per scan on the RTX2000 workstation. Mean Dice scores exceeded 0.85 for bladder (0.92), rectum (0.89), and femurs (0.91 and 0.88), with rectum and femur variability attributed to inconsistent superior/inferior extent in clinical contours. Prostate Dice averaged 0.80, influenced by node or bed treatments and inclusion of seminal vesicles. Penile bulb and urethra yielded lower Dice scores, and their predicted contours required some modification. Despite training on 2D images with 3 mm slices, the model performed better on 3D images with 2 mm slices.
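One way to see how contour extent alone can lower Dice for long structures such as rectum and femurs is to recompute the score on only those axial slices where both contours are present. The sketch below is an illustration under stated assumptions, not the analysis performed in the study: masks are boolean arrays with the slice direction on axis 0, and `dice_on_shared_slices` is a hypothetical helper.

```python
import numpy as np


def dice(a: np.ndarray, b: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    denom = a.sum() + b.sum()
    return float(2.0 * np.logical_and(a, b).sum() / denom) if denom else float("nan")


def dice_on_shared_slices(pred: np.ndarray, ref: np.ndarray, axis: int = 0) -> float:
    """Dice restricted to slices (along `axis`) that contain both contours."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    in_plane = tuple(i for i in range(pred.ndim) if i != axis)
    shared = np.any(pred, axis=in_plane) & np.any(ref, axis=in_plane)
    if not shared.any():
        return float("nan")
    return dice(np.compress(shared, pred, axis=axis),
                np.compress(shared, ref, axis=axis))


# Toy example (hypothetical data): identical in-plane shape, but the
# prediction extends over 8 slices while the reference covers only 4.
ref = np.zeros((12, 8, 8), dtype=bool)
ref[2:6, 2:6, 2:6] = True
pred = np.zeros_like(ref)
pred[2:10, 2:6, 2:6] = True

full_dice = dice(pred, ref)                     # penalized by the extra slices
shared_dice = dice_on_shared_slices(pred, ref)  # extent difference removed
print(f"full: {full_dice:.3f}, shared slices only: {shared_dice:.3f}")
```

A large gap between the two values suggests that disagreement comes mostly from where the contour starts and stops rather than from its in-plane shape.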
Conclusion:
Implementation of a fully automated in-house MR AI segmentation model reduced overall contouring time, and the model was robust to variations in input image parameters. Clinical implementation has improved workflow efficiency and reduced treatment duration. Further code and hardware optimization could reduce inference time by 75%.
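As a worked check of what a 75% reduction would mean in practice, the arithmetic below takes the mean inference time in Results as 3 minutes 40 seconds. The variable names are illustrative, and the result is a projection from the stated figures, not a measured value.

```python
# Mean inference time per scan, taken as 3 minutes 40 seconds.
mean_inference_s = 3 * 60 + 40          # 220 s

# Projected reduction stated in the Conclusion.
reduction = 0.75

projected_s = mean_inference_s * (1 - reduction)
minutes, seconds = divmod(round(projected_s), 60)
print(f"Projected inference time: {minutes}:{seconds:02d} (min:s)")
```

Under these assumptions the projected inference time is 55 seconds per scan.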