• Media type: E-Article
  • Title: Automated abstraction of real-world clinical outcome in lung cancer: A natural language processing and artificial intelligence approach from electronic health records
  • Contributor: Ma, Meng; Redfern, Arielle; Zhou, Xiang; Li, Dan; Ru, Ying; Lee, Kyeryoung; Gilman, Christopher; Liu, Zongzhi; Jones, Scott; Mai, Yun; Deitz, Matthew; Gong, Yunrou; Mullaney, Tommy; Prentice, Tony; Chen, Rong; Schadt, Eric; Wang, Xiaoyan
  • Imprint: American Society of Clinical Oncology (ASCO), 2020
  • Published in: Journal of Clinical Oncology
  • Language: English
  • DOI: 10.1200/jco.2020.38.15_suppl.e14062
  • ISSN: 0732-183X; 1527-7755
  • Keywords: Cancer Research ; Oncology
  • Description: <jats:p> e14062 </jats:p><jats:p> Background: Real-world evidence generated from electronic health records (EHRs) plays an increasing role in health care decisions and is recognized as essential for assessing cancer outcomes in real-world settings. Automatically abstracting outcomes from clinical notes has become a fundamental challenge in medical informatics. In this study, we aim to develop a system that automatically abstracts outcomes (Progression, Response, Stable Disease) from the notes of patients with lung cancer. Methods: A lung cancer cohort (n = 5,003) was obtained from the Mount Sinai Data Warehouse. The patients' progress, pathology, and radiology notes were used. We integrated several Natural Language Processing (NLP) and Artificial Intelligence (AI) techniques into a system that automatically abstracts outcomes. The corresponding images, biopsies, and lines of treatment (LOTs) were abstracted as attributes of outcomes. The system comprises four information models: 1. Customized NLP annotator model: preprocessor, section detector, sentence splitter, named entity recognition, and relation detector; conditional random field (CRF) and long short-term memory (LSTM) methods were applied to recognize entities and relations. 2. Clinical Outcome container model: biopsy evidence extractor, lines of treatment detector, image evidence extractor, clinical outcome event recognizer, date detector, and temporal reasoning; domain-specific rules were crafted to automatically infer outcomes. 3. Document Summarizer. 4. Longitudinal Outcome Summarizer. Results: To evaluate the abstracted outcomes, we curated a subset (n = 792) of the patient cohort for which LOTs were available. About 61% of the outcomes identified were supported by radiologic images (time window = ±14 days) or biopsy pathology results (time window = ±100 days). In 91% (720/792) of patients, Progression was abstracted within a time window of 90 days prior to first-line treatment. Also, 72% of the Progression events identified were accompanied by a downstream event (e.g., treatment change or death). We randomly selected 250 outcomes for manual curation, of which 197 were assessed to be correct (precision = 79%). Moreover, the automated abstraction system improved the efficiency of human abstractors, reducing curation time per patient by 90%. Conclusions: We have demonstrated the feasibility and effectiveness of NLP and AI approaches for abstracting outcomes from lung cancer EHR data. The approach shows promise for automatically abstracting outcomes and other clinical entities from notes across all cancers. </jats:p>
  • Access State: Open Access
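
The Description reports that outcomes were checked for supporting evidence within fixed time windows (±14 days for radiologic images, ±100 days for biopsy pathology) and that precision was 197/250. The following is a minimal sketch of how such a check and the precision figure might be computed. It is not the authors' implementation: the names `WINDOW_DAYS`, `is_supported`, and `precision` are hypothetical, and the example dates are invented for illustration only.

```python
from datetime import date

# Window sizes as stated in the abstract; the dictionary itself is an assumption.
WINDOW_DAYS = {"radiology": 14, "biopsy": 100}


def is_supported(outcome_date, evidence):
    """Return True if any (kind, date) evidence lies within its window of the outcome."""
    return any(
        abs((ev_date - outcome_date).days) <= WINDOW_DAYS[kind]
        for kind, ev_date in evidence
    )


def precision(correct, reviewed):
    """Fraction of manually reviewed outcomes that were judged correct."""
    return correct / reviewed


# Invented example: one radiology report 9 days after the outcome,
# one biopsy report about six months before it.
outcome = date(2019, 3, 1)
evidence = [("radiology", date(2019, 3, 10)), ("biopsy", date(2018, 9, 1))]

print(is_supported(outcome, evidence))   # True: the radiology report is within 14 days
print(round(precision(197, 250) * 100))  # 79, matching the reported precision
```

The symmetric `abs(...)` comparison reflects the "±" windows in the abstract; a one-sided window, such as the 90 days prior to first-line treatment, would instead need a signed comparison.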