Matching Insights From Clinical Experts and Generative AI for JCA PICO Validation

Author(s)

Benbow E¹, Klijn S², Jones C³, Varol N⁴, Malcolm B⁵, Reason T⁶, Chevli M⁴, Teitsson S⁴
¹Estima Scientific Ltd, Ruislip, UK, ²Bristol Myers Squibb, Utrecht, ZH, Netherlands, ³Estima Scientific Ltd, London, LON, UK, ⁴Bristol Myers Squibb, Uxbridge, LON, UK, ⁵Bristol Myers Squibb, Middlesex, LON, UK, ⁶Estima Scientific Ltd, South Ruislip, LON, UK

Presentation Documents

Benbow_E_PICO_alignment_31Oct2024_FINAL143643.pdf

OBJECTIVES: The EU Joint Clinical Assessment (JCA) process uses the Patient, Intervention, Comparator, Outcome (PICO) framework and asks each EU member state to put forward its PICO requirements. This could introduce many PICO sets that need consideration within a single JCA submission, so an automated process that can quickly determine which PICO sets align with a registrational trial’s PICO would be beneficial. We investigated whether large language models (LLMs) can determine the alignment of JCA PICO populations, as predicted by clinical experts, with the population of a target registrational trial, using a case study in patients with relapsed/refractory multiple myeloma (RRMM).

METHODS: Twenty predicted JCA PICO populations were identified for patients with RRMM and ≥1 prior line of therapy. We used a modal approach and provided prompts and contextual information to two LLMs (Claude 3 Opus and GPT-4), accessing their APIs through Python. Alignment was defined as “Full” (trial population = JCA population), “Partial (subgroup)” (trial population is a subgroup of the JCA population), “Partial (overlap)” (trial population overlaps with the JCA population), or “None” (no overlap). The accuracy of alignment categorization was determined by comparing the LLM outputs with the categorization made by clinical experts.
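
The four alignment categories can be read as set relations between the trial population and a JCA population. The following is a minimal Python sketch of that reading, not the authors' implementation: the names `Alignment` and `classify_alignment` are hypothetical, and the toy populations are sets of "number of prior lines" values invented purely for illustration. It assumes both populations are non-empty.

```python
from enum import Enum


class Alignment(Enum):
    """Alignment categories as named in the abstract."""
    FULL = "Full"
    PARTIAL_SUBGROUP = "Partial (subgroup)"
    PARTIAL_OVERLAP = "Partial (overlap)"
    NONE = "None"


def classify_alignment(trial: frozenset, jca: frozenset) -> Alignment:
    """Classify how a trial population aligns with a JCA population.

    Hypothetical helper: populations are modelled as non-empty sets of
    patient characteristics, which is a simplification for illustration.
    """
    if trial == jca:
        return Alignment.FULL
    if trial < jca:  # trial is a strict subset of the JCA population
        return Alignment.PARTIAL_SUBGROUP
    if trial & jca:  # some, but not all, members are shared
        return Alignment.PARTIAL_OVERLAP
    return Alignment.NONE


# Toy example: populations described by number of prior lines of therapy.
trial_population = frozenset({1, 2, 3})
print(classify_alignment(trial_population, frozenset({1, 2, 3, 4})).value)
print(classify_alignment(trial_population, frozenset({2, 3, 4, 5})).value)
```

In practice the populations are free-text definitions rather than enumerable sets, which is why the study relies on LLMs and expert judgment rather than a direct set comparison.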

RESULTS: Human classification of the alignment of the 20 populations was partial (subgroup) (“PS”) for 3 and partial (overlap) (“PO”) for 17. Claude was correct for 18/20, with 2 misclassifications (Full instead of PO; PO instead of PS). GPT was also correct for 18/20, with 2 misclassifications (both PO instead of PS). Potential ambiguity in the definitions of the misclassified populations is likely to have caused the misclassifications.
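
The reported accuracy can be reproduced from the counts above. The sketch below is illustrative only: the label counts and error types come from the abstract, but the position of each misclassified population in the lists is a hypothetical choice, and the helper names `accuracy` and `confusions` are not from the source.

```python
from collections import Counter

PS, PO, FULL = "Partial (subgroup)", "Partial (overlap)", "Full"

# Expert reference labels: 3 PS and 17 PO, as reported.
expert = [PS] * 3 + [PO] * 17

# Which populations were misclassified is not stated; indices are illustrative.
claude = list(expert)
claude[0] = PO    # PO instead of PS
claude[3] = FULL  # Full instead of PO

gpt = list(expert)
gpt[0] = PO       # PO instead of PS
gpt[1] = PO       # PO instead of PS


def accuracy(predicted, reference):
    """Fraction of populations where the LLM label matches the expert label."""
    if len(predicted) != len(reference):
        raise ValueError("label lists must have the same length")
    return sum(p == r for p, r in zip(predicted, reference)) / len(reference)


def confusions(predicted, reference):
    """Count each (expert label, LLM label) pair where the two disagree."""
    return Counter((r, p) for p, r in zip(predicted, reference) if p != r)


print(accuracy(claude, expert), dict(confusions(claude, expert)))
print(accuracy(gpt, expert), dict(confusions(gpt, expert)))
```

Both models score 18/20 (0.9), but the confusion counts differ: Claude makes one error in each direction, while both GPT errors are PS populations labelled as PO.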

CONCLUSIONS: If appropriate context is provided, LLMs are capable of understanding complex epidemiological concepts and categorizing the alignment of two populations. Thus, LLMs can be used to automate this categorization of PICOs within the JCA process.

Conference/Value in Health Info

2024-11, ISPOR Europe 2024, Barcelona, Spain

Value in Health, Volume 27, Issue 12, S2 (December 2024)

Code

HTA99

Topic

Methodological & Statistical Research, Study Approaches

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics, Meta-Analysis & Indirect Comparisons

Disease

No Additional Disease & Conditions/Specialized Treatment Areas, Oncology
