Reasons for Discontinuation of Ozempic Using a Natural Language Processing Pipeline on Unstructured Clinical Notes
Author(s)
Kumar V, Rasouliyan L, Wang Y, Aggarwal P, Long S
OMNY Health, Atlanta, GA, USA
OBJECTIVES: To understand reasons for discontinuation (r/dc) of Ozempic (semaglutide) through natural language processing (NLP) of clinical notes from electronic health records.
METHODS: Clinical note sentences of Ozempic patients from 5 health systems in the OMNY Health database from 2017 to 2024 were included if they contained the strings “Ozempic” or “semaglutide.” To identify r/dc, a question-answering (QA) pipeline was constructed that queried each sentence with the question “Why was the {Ozempic|semaglutide} stopped?” Non-null answers were extracted using a transformer-based model fine-tuned for QA. A separately fine-tuned text classification model, with 91.4% accuracy on this task, then mapped each answer to 1 of 7 r/dc categories: adverse drug event (ADE); drug or disease contraindication (DDC); finance-related (FIN); symptom resolution (RES); lack of efficacy (LE); pregnancy-related (PRG); and miscellaneous (MISC). ADEs were further resolved into ICD-10 codes using a licensed ICD-10 code resolution model. Results were qualitatively examined for accuracy and compared to product labeling.
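The control flow described in METHODS can be sketched as follows. This is a minimal illustration and not the authors' implementation: the names `build_question`, `extract_reasons`, `toy_answer`, and `toy_classify` are hypothetical, case-insensitive substring matching is an assumption, and the two `toy_*` functions are keyword-based stand-ins for the fine-tuned QA and classification models, which the abstract does not specify.

```python
import re
from typing import Callable, Iterable, List, Optional, Tuple

# The 7 r/dc categories named in the abstract.
CATEGORIES = ("ADE", "DDC", "FIN", "RES", "LE", "PRG", "MISC")
DRUG_NAMES = ("Ozempic", "semaglutide")


def build_question(sentence: str) -> Optional[str]:
    """Return the templated question, or None if no drug string is present.

    Assumption: matching is a case-insensitive substring test.
    """
    lowered = sentence.lower()
    for name in DRUG_NAMES:
        if name.lower() in lowered:
            return f"Why was the {name} stopped?"
    return None


def extract_reasons(
    sentences: Iterable[str],
    answer_fn: Callable[[str, str], Optional[str]],
    classify_fn: Callable[[str], str],
) -> List[Tuple[str, str, str]]:
    """Filter sentences, query each one, and classify non-null answers."""
    results = []
    for sentence in sentences:
        question = build_question(sentence)
        if question is None:
            continue  # sentence mentions neither drug string
        answer = answer_fn(question, sentence)
        if not answer:
            continue  # null answer: no reason stated in this sentence
        category = classify_fn(answer)
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        results.append((sentence, answer, category))
    return results


# Hypothetical stand-ins for the two fine-tuned models (keyword rules only).
_REASON = re.compile(
    r"(?:stopped|discontinued)\b.*?\b(?:due to|because of)\s+(.+?)[.]?$",
    re.IGNORECASE,
)


def toy_answer(question: str, sentence: str) -> Optional[str]:
    match = _REASON.search(sentence)
    return match.group(1) if match else None


def toy_classify(answer: str) -> str:
    text = answer.lower()
    if "nausea" in text:
        return "ADE"
    if "cost" in text or "insurance" in text:
        return "FIN"
    return "MISC"
```

Passing the models in as callables keeps the filtering and templating logic independent of any particular model library, so the stand-ins can be swapped for real QA and classification models without changing `extract_reasons`.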
RESULTS: A total of 1.22 million sentences across 90,955 patients were included, of which 9,385 sentences contained an r/dc. The breakdown of r/dc was ADE: 3,221 (34.3%); DDC: 1,664 (17.7%); FIN: 2,072 (22.1%); RES: 223 (2.4%); LE: 703 (7.5%); PRG: 79 (0.8%); MISC: 1,423 (15.2%). For ADE, the top 6 resolved ICD-10 codes, with percentages of the 3,221 ADE sentences, were R11.0 [nausea, 439 (13.6%)]; T38.5X [poisoning by, adverse effect of and underdosing of other estrogens and progestogens, 300 (9.3%)]; K63.89 [other specified diseases of intestine, 239 (7.4%)]; A04.7 [enterocolitis due to Clostridium difficile, 126 (3.9%)]; R63.4 [abnormal weight loss, 115 (3.6%)]; K59.0 [constipation, 109 (3.4%)].
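As a quick consistency check, the reported percentages can be recomputed from the counts. The sketch below uses only the figures in RESULTS. The variable names are illustrative, and taking the ADE count as the denominator for the ICD-10 codes is an inference from the reported percentages.

```python
# Sentence counts per r/dc category, as reported in RESULTS.
category_counts = {
    "ADE": 3221, "DDC": 1664, "FIN": 2072, "RES": 223,
    "LE": 703, "PRG": 79, "MISC": 1423,
}

# Top 6 resolved ICD-10 codes among ADE sentences, as reported in RESULTS.
ade_code_counts = {
    "R11.0": 439, "T38.5X": 300, "K63.89": 239,
    "A04.7": 126, "R63.4": 115, "K59.0": 109,
}


def percentages(counts, denominator):
    """Return each count as a percentage of the denominator, to 1 decimal."""
    return {key: round(100 * n / denominator, 1) for key, n in counts.items()}


total_rdc = sum(category_counts.values())
category_pct = percentages(category_counts, total_rdc)

# Assumption: ICD-10 percentages are relative to all ADE sentences.
ade_code_pct = percentages(ade_code_counts, category_counts["ADE"])

for key, pct in category_pct.items():
    print(f"{key}: {category_counts[key]} ({pct}%)")
```

The category counts sum to the 9,385 sentences reported as containing an r/dc, and both sets of recomputed percentages match the abstract.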
CONCLUSIONS: R/dc frequencies align with intuition, given the high out-of-pocket costs and frequent adverse events of Ozempic. Further, adverse events were largely gastrointestinal, consistent with product labeling. Our results demonstrate the feasibility of NLP for extracting r/dc from unstructured data. Further work is necessary to evaluate consistency across other glucagon-like peptide-1 receptor agonists and to improve accuracy, possibly using large language models.
Conference/Value in Health Info
Value in Health, Volume 27, Issue 12, S2 (December 2024)
Code
MSR20
Topic
Methodological & Statistical Research, Patient-Centered Research, Real World Data & Information Systems, Study Approaches
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics, Distributed Data & Research Networks, Electronic Medical & Health Records, Patient-reported Outcomes & Quality of Life Outcomes
Disease
Diabetes/Endocrine/Metabolic Disorders (including obesity), Drugs, Gastrointestinal Disorders