Discover New Clinical Insights and Genomic Test Results from Breast Cancer Patients’ Records
Valuable Insights Trapped in Unstructured Data
Electronic health records (EHRs) and related clinical documents are valuable sources of intelligence on patients’ health journeys, including data on therapies, treatments, outcomes, and diagnostic tests. But extracting that intelligence remains challenging for clinicians and scientists because most of it exists as unstructured text.
To address this issue, scientists and clinicians from leading healthcare systems in the US and the UK applied Pangaea Data’s breakthrough artificial intelligence-based methods to extract and summarize such intelligence from EHRs in a privacy-preserving manner across multiple disease areas, including cancer.
Today, many cancer patients undergo tumor profiling as part of their clinical care. These tests are paving the way for more personalized therapies based on improved tumor characterization. Test results are typically provided in reports that are attached to health records along with other data such as doctors’ notes.
Together, these documents provide important context about the patient’s health journey. They are rich sources of intelligence that reflect the impact of genomic test results on treatment decisions and health outcomes. This information can benefit clinical research focused on the heterogeneity of different tumor subtypes. It can also help expand scientists’ understanding of cancer genomics, leading to more realistic disease models and more effective treatments. These are just some of the potential benefits of extracting high-quality genomic data from health records.
PIES: Intelligence Extraction from Unstructured Data
Pangaea’s Intelligence Extraction and Summarization (PIES) software was designed to make pulling that data possible without compromising patients’ privacy. It uses novel unsupervised AI methods based on natural language processing and natural language generation to extract and summarize actionable insights from unstructured text. Additionally, the PIES automated machine learning framework combines these insights with structured data to determine the subset and permutation of features that are most useful for stratifying target patient populations.
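To make the idea of choosing a feature subset for stratification concrete, here is a minimal sketch in Python. It is not the PIES framework, whose internals are not described here. It assumes a simple exhaustive search over small subsets, scored by how well each subset separates patients into groups that share the same label. The function names, feature names, and toy records are all hypothetical, and the sketch ignores feature ordering (the "permutation" part).

```python
from collections import Counter, defaultdict
from itertools import combinations


def stratification_purity(rows, labels, features):
    """Fraction of patients whose label matches the majority label of their stratum.

    A stratum is the group of patients sharing the same values for `features`.
    """
    strata = defaultdict(list)
    for row, label in zip(rows, labels):
        key = tuple(row[f] for f in features)
        strata[key].append(label)
    matched = sum(Counter(group).most_common(1)[0][1] for group in strata.values())
    return matched / len(labels)


def best_feature_subset(rows, labels, candidates, max_size=2):
    """Exhaustively score every subset up to `max_size` features.

    The comparison is strict, so ties keep the smaller (earlier) subset.
    """
    best = ((), stratification_purity(rows, labels, ()))
    for size in range(1, max_size + 1):
        for subset in combinations(candidates, size):
            score = stratification_purity(rows, labels, subset)
            if score > best[1]:
                best = (subset, score)
    return best


# Hypothetical toy records: extracted text features merged with structured data.
rows = [
    {"stage": "IV", "her2": "positive", "age_band": "60+"},
    {"stage": "IV", "her2": "negative", "age_band": "60+"},
    {"stage": "II", "her2": "positive", "age_band": "<60"},
    {"stage": "II", "her2": "negative", "age_band": "60+"},
    {"stage": "IV", "her2": "positive", "age_band": "<60"},
    {"stage": "II", "her2": "positive", "age_band": "60+"},
]
# Hypothetical target label, e.g. whether the patient became critically ill.
labels = [True, False, False, False, True, False]

subset, score = best_feature_subset(rows, labels, ["stage", "her2", "age_band"])
print(subset, score)
```

On this toy data the search selects the pair `stage` and `her2`, because together they split the patients into groups with a single label each. The score is measured on the same records used for the search, so a real workflow would need held-out validation and a search strategy that scales beyond a handful of features.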
To demonstrate what PIES can do, Pangaea designed a project with physicians and scientists from the University of Illinois Cancer Center and the University of Washington. The team looked at the health records of an ethnically and racially diverse pool of breast cancer patients, including notes from their doctors and reports from genomic tests.
The results were impressive. The models extracted 26 features related to breast cancer from patient records with 97 percent accuracy. Fourteen of those features were extracted with 100 percent accuracy. The list of features included demographics, cancer stage, grade, scores, treatments, and genetic test results for specific cancer biomarkers.
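For readers who want to see how figures like these can be computed, the following Python sketch scores extracted values against expert-annotated reference values, feature by feature. It is an illustration only: the text does not say how accuracy was defined in the project, so exact-match scoring and a simple average across features are assumptions, and the function name, feature names, and records are hypothetical.

```python
def per_feature_accuracy(extracted, reference):
    """Exact-match accuracy for each feature across all patient records."""
    features = list(reference[0].keys())
    correct = {f: 0 for f in features}
    for pred, gold in zip(extracted, reference):
        for f in features:
            if pred.get(f) == gold[f]:
                correct[f] += 1
    n = len(reference)
    return {f: correct[f] / n for f in features}


# Hypothetical expert-annotated reference values for four patients.
reference = [
    {"stage": "II", "grade": "2", "brca1": "negative"},
    {"stage": "III", "grade": "3", "brca1": "positive"},
    {"stage": "I", "grade": "1", "brca1": "negative"},
    {"stage": "IV", "grade": "3", "brca1": "negative"},
]
# Hypothetical model output: one grade value disagrees with the reference.
extracted = [
    {"stage": "II", "grade": "2", "brca1": "negative"},
    {"stage": "III", "grade": "2", "brca1": "positive"},
    {"stage": "I", "grade": "1", "brca1": "negative"},
    {"stage": "IV", "grade": "3", "brca1": "negative"},
]

accuracy = per_feature_accuracy(extracted, reference)
# Features extracted without a single error.
perfect = sorted(f for f, a in accuracy.items() if a == 1.0)
# Simple average of the per-feature accuracies.
overall = sum(accuracy.values()) / len(accuracy)
print(accuracy, perfect, overall)
```

Reporting both the average and the count of perfectly extracted features, as the project summary does, shows whether errors are spread evenly or concentrated in a few hard features.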
PIES affirmed existing relationships between clinical (phenotypic) and genomic features extracted from the records for critically ill breast cancer patients and also discovered new relationships for characterizing such patients. This has value for physicians in intensive care units because it allows them to apply such knowledge to understand patient trajectories and allocate resources more effectively. It also sheds light on the genomic and phenotypic underpinnings of critical breast cancer cases. All of these results were validated by clinical experts who can confirm the quality of the work and software.
Pangaea has also published several papers and case studies that document the value of PIES in clinical and research contexts for characterizing hard-to-diagnose conditions, finding undiagnosed and miscoded patients, summarizing patient records, and automatically generating regulatory reports.