Identifying COVID-19 Patients from Unstructured Notes: Performance of a Commercial Clinical Named Entity Recognition System
Author(s)
Kumar V, Rasouliyan L, Long S, Rao MB
OMNY Health, Atlanta, GA, USA
OBJECTIVES: The effectiveness of commercial clinical named entity recognition (NER) systems in recognizing COVID-19-related concepts is not well understood. The objective of this study was to measure the performance of the Amazon Comprehend Medical ICD-10-CM Ontology Linking API (ACM-ICD10) in detecting COVID-19-related concepts in a sample of clinical documents obtained from COVID-19-positive patients.

METHODS: We obtained all clinical notes of patients who had tested positive for COVID-19 at a large medical institution in the OMNY Health System Database during one month in late 2020. Patients were excluded if the U07.1 ICD-10-CM code was not present in their structured diagnosis data. We processed the notes using ACM-ICD10 and obtained the ICD-10 concept mappings for each note.

RESULTS: After exclusions, 63 patients remained for analysis. These 63 patients had 791 diagnoses in their structured diagnosis data. Processing the 1,488 notes of the 63 patients through ACM-ICD10 yielded 81,599 ICD-10 concept mappings, none of which contained the U07.1 code. However, some mappings corresponded to symptoms of COVID-19 or to alternative diagnoses, including “dyspnea, unspecified” (R06.00; 518 times; 0.6%), “shortness of breath” (R06.02; 505 times; 0.6%), “unspecified infectious disease” (B99.9; 466 times; 0.6%), and “viral infection, unspecified” (B34.9; 292 times; 0.4%). Text containing the substring “covid” was mapped to ICD-10 concepts 75 times, most commonly to “viral infection, unspecified” (B34.9; 5 times) and “encephalitis and encephalomyelitis, unspecified” (G04.90; 5 times).

CONCLUSIONS: ACM-ICD10 did not map COVID-19-related text to U07.1, the standard code recommended by the CDC, but it often mapped such text to symptoms and signs of COVID-19 infection. These results suggest that commercial NER tools may need to be recalibrated periodically to account for new and emerging illnesses.
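The tallying step described in the methods and results could be sketched roughly as follows. This is an illustration only, not the authors' code. It assumes each note has already been processed and that each response is a dictionary with an `Entities` list whose items carry a `Text` field and an `ICD10CMConcepts` list of candidate codes, which is the general shape returned by the `infer_icd10_cm` call of the boto3 `comprehendmedical` client. The function names and the sample data are hypothetical, and counting every candidate concept (rather than only the top-ranked one) is an assumption, since the abstract does not say which was done.

```python
from collections import Counter


def tally_icd10_codes(responses):
    """Count how often each ICD-10-CM code appears across all responses.

    Assumption: every candidate concept for every entity is counted.
    """
    counts = Counter()
    for response in responses:
        for entity in response.get("Entities", []):
            for concept in entity.get("ICD10CMConcepts", []):
                counts[concept["Code"]] += 1
    return counts


def codes_for_substring(responses, substring):
    """Count codes only for entities whose text contains the substring
    (case-insensitive), e.g. "covid"."""
    needle = substring.lower()
    counts = Counter()
    for response in responses:
        for entity in response.get("Entities", []):
            if needle in entity.get("Text", "").lower():
                for concept in entity.get("ICD10CMConcepts", []):
                    counts[concept["Code"]] += 1
    return counts


# Hypothetical, hand-written responses standing in for real API output.
sample_responses = [
    {
        "Entities": [
            {
                "Text": "shortness of breath",
                "ICD10CMConcepts": [{"Code": "R06.02", "Score": 0.9}],
            },
            {
                "Text": "COVID infection",
                "ICD10CMConcepts": [{"Code": "B34.9", "Score": 0.6}],
            },
        ]
    },
    {
        "Entities": [
            {
                "Text": "SOB",
                "ICD10CMConcepts": [{"Code": "R06.02", "Score": 0.8}],
            }
        ]
    },
]

all_counts = tally_icd10_codes(sample_responses)
covid_counts = codes_for_substring(sample_responses, "covid")
total_mappings = sum(all_counts.values())

# Share of all mappings per code, plus an explicit check for U07.1.
for code, n in all_counts.most_common():
    print(f"{code}: {n} ({100 * n / total_mappings:.1f}%)")
print("U07.1 mappings:", all_counts["U07.1"])
print("Codes mapped from 'covid' text:", dict(covid_counts))
```

A `Counter` returns zero for a code that never appears, so the absence of U07.1 can be checked directly without special handling.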
Conference/Value in Health Info
2021-05, ISPOR 2021, Montreal, Canada
Value in Health, Volume 24, Issue 5, S1 (May 2021)
Code
PIN64
Topic
Clinical Outcomes, Epidemiology & Public Health, Methodological & Statistical Research, Real World Data & Information Systems
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics, Clinical Outcomes Assessment, Disease Classification & Coding, Health & Insurance Records Systems
Disease
Infectious Disease (non-vaccine)