Larger, Deeper, and in Real Time: Applications of Machine Learning and Natural Language Processing on Electronic Health Records to Learn From the Patient Journey at Scale
The use of artificial intelligence (AI) in health economics and outcomes research (HEOR) was a running theme at ISPOR’s 2023 annual conference, held in May in Boston, MA. The Signal session presented there, “Larger, Deeper, and in Real Time: Applications of Machine Learning and Natural Language Processing on Electronic Health Records to Learn From the Patient Journey at Scale,” was no exception. Featuring 3 case studies, the session sought to describe the pragmatic impact of applying machine learning and natural language processing to electronic health records (EHRs) to generate and accelerate insights on the patient journey.
According to panel moderator Joe Vandigo, MBA, PhD, of Applied Patient Experience, “if engaging patients tells us about their experiences, machine learning and natural language processing can help us understand how representative that experience is at scale, not only in a way that’s rapid so that we can incorporate it into decision making, but also in a way that we can bring in populations that we’re currently not able to reach.”
As Vandigo explains, the patient journey is often thought about through the lens of patient experience data: aspects of the patient experience that matter to the patient. These can include symptoms, the natural history of the disease, patient experience with treatments, and patients’ quality of life and individual functioning. Real-world data (RWD) “can absolutely describe what patients experience with a disease or condition. But often those things aren’t what matters to the patient.” Still, these aspects overlap, and the collection and analysis of RWD should be informed by patient priorities. Machine learning and natural language processing “are absolutely useful” in identifying patterns around when individuals first interact with the healthcare system and in describing common treatment pathways and side effects.
Vandigo states that achieving the best ways to gather RWD will take human and AI collaboration, which could take the form of clinicians integrating AI into their clinical practice or of engaging patients. At the same time, researchers must be able to explain the purpose, implementation, and interpretation of these models: how clinicians integrate AI into their practice and where researchers should engage patients and other stakeholders in machine learning and natural language processing. For the latter, Vandigo argues, “there is a need for engaging them continuously.”
Audience polling conducted during the session found that many attendees would prefer the AI they use to be assistive (helping patients make decisions and acting as a second set of eyes for clinicians and researchers) rather than autonomous (making clinical treatment decisions that affect patient care and outcomes).
“To achieve the best ways to gather real-world data will take human and AI collaboration, which could take the form of clinicians integrating AI into their clinical practice or the form of engaging patients.”— Joe Vandigo, MBA, PhD
Getting AI Into Clinical Workflows
Ravi Parikh, MD, MPP, an oncologist at the University of Pennsylvania, runs the Human Algorithm Collaboration Lab, which focuses on the “last mile” problem of AI: bringing AI into clinical care workflows. Rather than building “the latest and greatest” deep-learning algorithm, Parikh says he and his colleagues are testing simpler algorithms and bringing them to the point of care faster to demonstrate their use cases.
The work done by this group has convinced Parikh that AI not only needs humans but actually benefits from human involvement in shaping it, especially at the point of care. He described how the EHR he encountered as an oncology resident had a low-level machine learning algorithm that generated a readmission risk score for each patient based on diagnosis codes. While some clinicians might find that risk assessment useful, he says he found it distracting because patients with cancer were always scored as having a high readmission risk. “If I listened to it, I would have never discharged anyone from the hospital.”
Furthermore, the EHR’s risk assessment “didn’t correspond to levels of risks that actually matter to me,” Parikh says, and “probably wasn’t designed with an end user in mind because it doesn’t really tell the clinician what to do.” Not being tied to a clinical intervention “is one of the big bugaboos of why AI algorithms largely are viewed as extra, more sophisticated bells and whistles, as opposed to something that actually helps clinicians in their workflow.”
While clinicians are seeking to use AI in an assistive way, Parikh points out that because humans are the end users, the results are still subject to human biases. “We ought to be structuring the AI in ways that can help counter some of those heuristic and cognitive biases that are responsible for suboptimal clinical decision making, rather than just generating the most accurate tool,” he says.
When designing a human machine collaborative system, Parikh and his colleagues have found there are 3 important stakeholders: (1) the machine itself; (2) the data scientists; and (3) the clinicians.
And when considering the clinicians: although they do see all the patients’ information and theoretically have years of training to draw on, “there’s a lot of inter-clinician bias in terms of decision making and variability.” Engaging team members on the human side requires communication among clinicians, behavioral scientists, and mixed-methods researchers to figure out how to design workflows and use technologies like AI in clinical decision making.
As Parikh points out, “even if you had the perfect machine and the perfect 100% human accuracy all the time, if you deploy it in a context that’s not ready for use (eg, if you don’t have an intervention to tie to the prediction or the diagnosis, or if it’s operationalized as a column in your EHR, or as a bell and whistle in addition to all the other bells and whistles in the intensive care unit), then it’s unlikely to engender impact.”
Seeing the Text From EHRs as Data
Selen Bozkurt, PhD, a biomedical informatics researcher formerly at Stanford University and now at Emory University, says while EHR data are rich, they are mostly in a text format, adding that “even radiology images have text reports.”
Learning automatically from this text data will take natural language models. “As text is our input, it can be a rule-based model using terminologies; it can be a machine learning model using distributional semantics; or it can be a large language model using transformers like ChatGPT,” Bozkurt says. All of these models convert text into a numerical format for other computational purposes, such as classification, prediction, or generating other text.
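As a concrete illustration of the middle category, here is a minimal sketch in Python, assuming scikit-learn and entirely invented toy notes and labels (a hypothetical example, not code from the session): a machine learning model that converts text into numerical vectors and uses them for classification.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented stand-ins for de-identified EHR note snippets.
notes = [
    "patient reports severe nausea after chemotherapy",
    "no adverse events noted at follow-up visit",
    "grade 3 fatigue and vomiting since last cycle",
    "tolerating treatment well, no complaints",
]
labels = [1, 0, 1, 0]  # 1 = side effect mentioned, 0 = not

# TF-IDF converts each note into a numerical vector; the classifier
# then learns to label new notes from those vectors.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(notes, labels)

# With this toy data, a nausea/vomiting note should classify as 1.
print(model.predict(["patient complains of nausea and vomiting"]))
```

In the same spirit, a rule-based model would swap the learned classifier for curated terminology lookups, and a large language model would swap the TF-IDF vectors for learned transformer embeddings.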
One of Bozkurt’s first research studies, about 10 years ago, involved converting the text in mammography reports into structured data fields. More recently, she and her colleagues investigated extracting missing cancer stage information from cancer registries, having found that 30% of the EHR records were missing this information. They wound up reprocessing the notes using a knowledge base, purging redundant text and dividing the notes into smaller pieces such as words, sentences, or phrases. “This knowledge base part is important, because we want to have full control of what we are doing,” Bozkurt says. “We didn’t want something to make things up or hallucinate.” The base was created using expert knowledge ontologies, which Bozkurt states are a “great source of knowledge plus some distributional semantics.”
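A rule-based, knowledge-base-driven extraction of the kind described here might look roughly like the following sketch (a hypothetical toy, not the team’s actual system): notes are split into sentences and matched against curated staging patterns, so every extracted value traces back to an explicit rule rather than to a model that could hallucinate.

```python
import re

# Invented stand-in for an expert-curated knowledge base:
# surface patterns mapped to structured stage values.
STAGE_TERMS = {
    r"\bstage\s+iv\b": "IV",
    r"\bstage\s+iii\b": "III",
    r"\bstage\s+ii\b": "II",
    r"\bstage\s+i\b": "I",
}

def extract_stage(note: str) -> str | None:
    """Split a note into sentences and return the first stage match."""
    for sentence in re.split(r"(?<=[.;])\s+", note.lower()):
        for pattern, stage in STAGE_TERMS.items():
            if re.search(pattern, sentence):
                return stage
    return None

print(extract_stage("Pathology consistent with stage III disease."))  # III
```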
In the end, Bozkurt and her colleagues were able to extract 70% of missing pathological stage information and 30% of missing clinical stage information. “It was not perfect but it was pretty promising as a proof of concept,” she says.
Bozkurt says natural language processing and machine learning can unlock the power of EHRs, “but it is not a straightforward procedure,” and involves “making several careful decisions. It is not like all the data get put into a black box and we accept everything blindly that they produce.”
Getting That Data Faster
Katherine Tan, PhD, a senior data scientist at Flatiron Health, says the remaining challenge of using EHR data is the huge amount of unstructured data they contain. While a trained abstractor could perform chart reviews to manually identify and extract the technical terms, this process is resource-intensive and costly. As a result, tradeoffs are made, mainly sacrificing speed.
“For example, consider a rare population where to identify just a single patient of interest requires surfacing an enormous volume of text,” Tan says. “To be able to find enough patients to power analyses, there is this constant expense of keeping up with the latest standard of care and how quickly we’re able to add new variables to our data set.”
Advances in machine learning and natural language processing enable researchers to extract information from patient charts with greater scale, flexibility, and efficiency. These extraction models use natural language processing to sift through the enormous volume of text and pull out the most relevant snippets. The snippets are then fed into a machine learning algorithm that outputs the clinical outcome of interest, such as a biomarker name or a mutation detail. “The application here is not predicting information, not inferring information based on the patient chart or generating text,” Tan says. “It is a scalable and automated way of extracting information from the patient chart similar to how an abstractor would. At the end of the day, machine learning is a tool to help us do our jobs better. And for real-world evidence, that could mean getting higher quality data faster.”
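That two-stage shape, surfacing relevant snippets and then mapping them to a structured value, might look roughly like the sketch below. This is hypothetical, not Flatiron’s pipeline; a trivial rule stands in for the trained machine learning model in the second stage.

```python
import re

def surface_snippets(chart: str, keyword: str, window: int = 60) -> list[str]:
    """Stage 1: sift the chart text for short windows around a keyword."""
    return [
        chart[max(0, m.start() - window):m.end() + window]
        for m in re.finditer(re.escape(keyword), chart, re.IGNORECASE)
    ]

def classify_biomarker(snippets: list[str]) -> str:
    """Stage 2 (stub): in practice a trained model maps snippets to a
    structured value; a trivial rule stands in for that model here."""
    joined = " ".join(snippets).lower()
    if "egfr" in joined and "positive" in joined:
        return "EGFR-positive"
    if "egfr" in joined and "negative" in joined:
        return "EGFR-negative"
    return "unknown"

chart_text = ("... Molecular pathology report: EGFR mutation positive "
              "(exon 19 deletion). Plan discussed with patient ...")
print(classify_biomarker(surface_snippets(chart_text, "EGFR")))  # EGFR-positive
```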
ISPOR members continue to explore the uses, challenges, and parameters of AI in producing real-world evidence, sharing results through Value in Health and conference presentations on topics such as AI in drug launches, health technology assessments, and drug pricing.
ISPOR started the Signal program to bring a broader understanding of innovation (beyond product innovation), with the goal of putting these issues front and center for the HEOR community. Each episode in the series is a self-contained installment and not dependent on the previous episodes. However, all of them are connected by an intent to look at the concept of innovation and experience with it from different groups of healthcare stakeholders, building foresight into how these innovations might impact healthcare decision making in the next decade.
The ISPOR Signal Program is now delivered live at our annual in-person conferences. The next ISPOR Signal, titled “EU Joint Clinical Assessment: One for All and All for One?,” is scheduled to take place at ISPOR Europe 2023 on November 13, 2023, from 10:15 to 11:15 in Copenhagen, Denmark.
Read more about past Signal events in Value & Outcomes Spotlight
• ISPOR Generates a Signal for Transmitting Innovation
• From Measuring Costs to Measuring Outcomes: Revamping Healthcare at a System Level
• Beyond Cost-Effectiveness: Defining and Mapping Out Innovation at NICE
• Looking at the Downstream Value as Investment in Digital Health Increases
For more information and to register
www.ispor.org/signal
About the author
Christiane Truelove is a freelance medical writer based in Bristol, PA.