Identifying COVID-19 Patients from Unstructured Notes: Performance of a Commercial Clinical Named Entity Recognition System

Author(s)

Kumar V, Rasouliyan L, Long S, Rao MB
OMNY Health, Atlanta, GA, USA

OBJECTIVES: The effectiveness of commercial clinical named entity recognition (NER) systems in recognizing COVID-19-related concepts is not well understood. The objective of this study was to measure the performance of the Amazon Comprehend Medical ICD-10-CM Ontology Linking API (ACM-ICD10) for detecting the presence of COVID-19-related concepts in a clinical document sample obtained from COVID-19 positive patients.

METHODS: We obtained all clinical notes of patients who had tested positive for COVID-19 at a large medical institution in the OMNY Health System Database during one month in late 2020. Patients were excluded if the U07.1 ICD-10-CM code was not present in their structured diagnosis data. We processed the notes using ACM-ICD10 and obtained the ICD-10 concept mappings for each note.
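The note-processing step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the helper names `top_concept_codes` and `map_notes` are hypothetical, and keeping only the top-scoring concept per entity is an assumption, since the abstract does not say how candidate concepts were selected. The response shape (`Entities`, `ICD10CMConcepts`, `Code`, `Score`) follows the `infer_icd10_cm` operation of the boto3 `comprehendmedical` client.

```python
from collections import Counter
from typing import Callable, Iterable


def top_concept_codes(response: dict) -> list[tuple[str, str]]:
    """Return (entity text, ICD-10-CM code) pairs for one API response.

    Assumption: only the highest-scoring concept per entity is kept.
    """
    pairs = []
    for entity in response.get("Entities", []):
        concepts = entity.get("ICD10CMConcepts", [])
        if not concepts:
            continue  # entity detected but not linked to any code
        best = max(concepts, key=lambda c: c["Score"])
        pairs.append((entity["Text"], best["Code"]))
    return pairs


def map_notes(notes: Iterable[str], infer: Callable[[str], dict]) -> Counter:
    """Count ICD-10-CM codes across notes using any callable `infer`."""
    counts: Counter = Counter()
    for note in notes:
        for _text, code in top_concept_codes(infer(note)):
            counts[code] += 1
    return counts


# With AWS credentials configured, `infer` could wrap the real service:
#   import boto3
#   client = boto3.client("comprehendmedical")
#   infer = lambda text: client.infer_icd10_cm(Text=text)
```

Passing `infer` as a parameter keeps the counting logic testable without network access. The service limits the size of each request, so long notes would likely need to be split before submission.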

RESULTS: After exclusions, 63 patients remained for analysis. These 63 patients had 791 diagnoses in their structured diagnosis data. Processing the 1,488 notes of the 63 patients through ACM-ICD10 yielded 81,599 ICD-10 concept mappings, none of which contained the U07.1 code. However, mappings were observed that related to symptoms and alternative diagnoses of COVID-19 including “dyspnea, unspecified” (R06.00; 518 times; 0.6%), “shortness of breath” (R06.02; 505 times; 0.6%), “unspecified infectious disease” (B99.9; 466 times; 0.6%), and “viral infection, unspecified” (B34.9; 292 times; 0.4%). Specifically, text containing the “covid” substring was mapped to ICD-10 concepts 75 times, most commonly to “viral infection, unspecified” (B34.9; 5 times) and “encephalitis and encephalomyelitis, unspecified” (G04.90; 5 times).
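The reported percentages can be checked against the total of 81,599 mappings with a few lines of arithmetic. The sketch below uses only counts stated in the results; the function names `share_percent` and `covid_pairs` are hypothetical, and the substring filter is one plausible reading of how text containing “covid” was identified.

```python
TOTAL_MAPPINGS = 81_599  # ICD-10 concept mappings reported for 1,488 notes

# Counts reported in the results for four COVID-19-related codes.
reported_counts = {
    "R06.00": 518,  # dyspnea, unspecified
    "R06.02": 505,  # shortness of breath
    "B99.9": 466,   # unspecified infectious disease
    "B34.9": 292,   # viral infection, unspecified
}


def share_percent(count: int, total: int = TOTAL_MAPPINGS) -> float:
    """Share of all mappings, as a percentage rounded to one decimal."""
    return round(100 * count / total, 1)


def covid_pairs(pairs: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Keep (text, code) pairs whose text contains 'covid', ignoring case."""
    return [(text, code) for text, code in pairs if "covid" in text.lower()]


shares = {code: share_percent(n) for code, n in reported_counts.items()}
```

Each computed share rounds to the value given in the results (0.6%, 0.6%, 0.6%, and 0.4%), so together these four codes account for about 2% of all mappings.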

CONCLUSIONS: ACM-ICD10 did not map COVID-19-related text to U07.1, the standard code recommended by the CDC, but it often mapped such text to symptoms and signs of COVID-19 infection. These results suggest that commercial NER tools may need to be recalibrated periodically to account for new and emerging illnesses.

Conference/Value in Health Info

2021-05, ISPOR 2021, Montreal, Canada

Value in Health, Volume 24, Issue 5, S1 (May 2021)

Code

PIN64

Topic

Clinical Outcomes, Epidemiology & Public Health, Methodological & Statistical Research, Real World Data & Information Systems

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics, Clinical Outcomes Assessment, Disease Classification & Coding, Health & Insurance Records Systems

Disease

Infectious Disease (non-vaccine)
