Author: Melissa Bronson, Elizabeth L. Covington, Robert T. Dess, Joseph R. Evans, William C. Jackson, Charles S. Mayo, Michelle L. Mierzwa, Benjamin S. Rosen, VG Vinod Vydiswaran, Grant Weyburne, Zheng Zhang, Henry Zocher
Affiliation: PCORnet®, The National Patient-Centered Clinical Research Network, University of Michigan
Purpose: Diagnosis and staging are an integral part of cancer care, but this information is often scattered across various electronic medical records. This fragmentation increases the overall documentation burden and the risk of transcription errors. In this study, we assess the feasibility of extracting diagnosis and staging details from clinical notes at scale to create a single, unified source of information, by applying semantic search methods to a real-world data warehouse (RWDW).
Methods: We extracted encounter and treatment records dated between 2013 and 2024 for prostate cancer (PCa) patients from the RWDW. This data infrastructure sources clinical notes from the hospital's electronic health records reporting database (Clarity, Epic Systems Corporation) and links them with course records in the treatment management system (ARIA, Varian Medical Systems). We included notes beginning with "Diagnosis:" and courses marked with the International Classification of Diseases (ICD) code for primary PCa (ICD-9 185 or ICD-10 C61). Patients who received consultation but no treatment were excluded. A total of 848 records were identified. We used pre-trained DeBERTa-variant models to perform semantic search over the first 200 characters of each note (referred to as "proximal notes") and measured the concordance of the extracted information with the diagnosis and staging details stored in the treatment courses. Our study focused on T-stage, as it is more clinically meaningful for prostate staging than N-stage or M-stage.
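The inclusion criteria and the "proximal note" window described in the Methods can be illustrated with a minimal Python sketch. It is not the study's pipeline: the `Record` type, its field names, and the helper functions are hypothetical, and it assumes each linked record already carries the note text, the course ICD code, and a flag for whether treatment was delivered. The semantic search step itself is not shown.

```python
from dataclasses import dataclass

# ICD codes for primary prostate cancer named in the Methods.
PCA_ICD_CODES = {"185", "C61"}  # ICD-9 185, ICD-10 C61
# Length of the "proximal note" window named in the Methods.
PROXIMAL_CHARS = 200


@dataclass
class Record:
    """Hypothetical linked note/course record (field names are assumptions)."""
    note_text: str
    icd_code: str
    treated: bool


def is_eligible(record: Record) -> bool:
    """Apply the three inclusion criteria stated in the Methods."""
    return (
        record.note_text.startswith("Diagnosis:")
        and record.icd_code.strip().upper() in PCA_ICD_CODES
        and record.treated  # consultation-only patients are excluded
    )


def proximal_note(record: Record) -> str:
    """Return the first 200 characters of the note."""
    return record.note_text[:PROXIMAL_CHARS]
```

Under these assumptions, the eligible proximal notes would be the text passed to the semantic search models.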
Results: Our pipeline resulted in a macro-averaged sensitivity of 0.996 and a macro-averaged specificity of 0.997 for extracting T-stage.
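For readers unfamiliar with macro-averaging, the sketch below shows one common way to compute the reported metrics: per-class (one-vs-rest) sensitivity and specificity, averaged with equal weight per class. The function name and the toy labels in the usage note are illustrative assumptions; the abstract does not state how the authors' evaluation code is written, and the sketch assumes classes are taken from the reference labels only.

```python
def macro_sensitivity_specificity(y_true, y_pred):
    """Macro-averaged one-vs-rest sensitivity and specificity.

    Assumption: the class set is the set of labels present in y_true.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    sensitivities, specificities = [], []
    for label in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        tn = sum(t != label and p != label for t, p in pairs)

        sensitivities.append(tp / (tp + fn))  # tp + fn > 0 by construction
        if tn + fp:  # undefined when every record belongs to this class
            specificities.append(tn / (tn + fp))

    macro_sens = sum(sensitivities) / len(sensitivities)
    macro_spec = (
        sum(specificities) / len(specificities) if specificities else float("nan")
    )
    return macro_sens, macro_spec
```

As a usage example, `macro_sensitivity_specificity(stored_stages, extracted_stages)` would compare the T-stage stored in each treatment course against the T-stage extracted from the matching proximal note, where both arguments are hypothetical lists of stage labels in the same record order.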
Conclusion: Semantic search over the RWDW shows great promise for extracting diagnosis and staging details at scale. Future work is needed to assess the influence of data comprehensiveness and biases.