Medical Scientific Table-to-Text Generation with Human-in-the-Loop under the Data Sparsity Constraint
Structured tabular data in the pre-clinical and clinical domains contains valuable information about individuals, and an efficient table-to-text summarization system can drastically reduce the manual effort needed to condense this data into regulatory reports in the biopharmaceutical industry. We introduce Pangaea’s Intelligence Extraction and Summarization (PIES), a neural architecture that, for the first time, addresses the challenging task of automatically generating textual narratives for regulatory-grade reports in the medical and scientific domain. PIES uses a unique data augmentation procedure to address data sparsity, training with only 25 examples. Coupled with a copy mechanism, PIES ensures model interpretability and the precision of the values that appear in the output (the textual narratives). PIES also generalizes across input datasets: the study shows that it selects salient biomedical entities and values from structured data and copies tabular values with improved precision (up to 93%) to generate coherent and accurate text for assay validation reports and toxicology reports. Moreover, PIES has been shown to drastically reduce the human effort normally required to improve the performance of such models through its unique Clinician-in-the-Loop protocol.
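To make the notion of copy precision concrete, the following minimal sketch computes the fraction of numeric values in a generated narrative that also occur in the source table. This abstract does not define the metric, so the function `copy_precision`, the exact string matching, and the sample table and sentence are illustrative assumptions rather than the method or data used by PIES.

```python
import re

# Matches integers and decimals such as "12" or "98.5" (assumed value format).
NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def copy_precision(table_values, generated_text):
    """Fraction of numeric tokens in the text that also occur in the table.

    Hypothetical helper: values are compared as exact strings, so "2.10"
    and "2.1" would count as different values.
    """
    source = {str(value) for value in table_values}
    found = NUMBER.findall(generated_text)
    if not found:
        return 0.0
    copied = sum(1 for token in found if token in source)
    return copied / len(found)


# Invented example data, not taken from the paper.
table = ["98.5", "2.1", "12"]
text = ("Mean recovery was 98.5% with a CV of 2.1% across 12 runs, "
        "below the 15% limit.")

print(copy_precision(table, text))  # 0.75: "15" is not in the table
```

Exact string matching keeps the sketch simple; a real evaluation would likely need to normalize units, rounding, and number formatting before comparing values.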