Improving the Performance of Generative AI to Achieve 100% Accuracy in Data Extraction
Author(s)
Klijn S1, Teitsson S2, Reason T3, Malcolm B4, Hill N5, Benbow E6
1Bristol-Myers Squibb, Utrecht, ZH, Netherlands, 2Bristol Myers Squibb, London, LON, UK, 3Estima Scientific Ltd, South Ruislip, LON, UK, 4Bristol Myers Squibb, Uxbridge, UK, 5Bristol Myers Squibb Company, Princeton, NJ, USA, 6Estima Scientific Ltd, Ruislip, UK
OBJECTIVES: We have previously demonstrated the potential of large language models (LLMs), such as GPT-4, to automate data extraction for network meta-analysis (NMA). Although a data extraction accuracy of over 97% was achieved, there is scope to improve performance and reliability, with the goal of 100% accuracy, before full implementation in health economics and outcomes research (HEOR). The aim of this study was to assess whether a modal approach, in which the extraction is repeated and the most frequent result is taken, improves the accuracy of data extraction from publications reporting overall survival in adult patients with advanced or metastatic non-small cell lung cancer (NSCLC).
METHODS: A modal algorithm, defined a priori, was postulated, developed, and tested. The algorithm used GPT-4, via a Python API, to automatically extract survival data from NSCLC publications multiple times and then took the mode of the extracted values within each block of 20 iterations. Results were compared with the data extraction conducted (and checked) by systematic literature review and NMA experts.
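The modal step described in the methods can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the field names (`median_os_months`, `hazard_ratio`), the example values, and the stub that stands in for a GPT-4 call are all hypothetical, and the abstract does not say how ties between equally frequent values were handled.

```python
from collections import Counter
from typing import Callable, Dict, Hashable, Optional, Sequence

# One extraction run: field name -> extracted value (None if the field was missed).
Extraction = Dict[str, Optional[Hashable]]


def modal_value(values: Sequence[Optional[Hashable]]) -> Optional[Hashable]:
    """Return the most frequent value; on a tie, the value seen first wins."""
    return Counter(values).most_common(1)[0][0]


def modal_extraction(
    extract_once: Callable[[str], Extraction],
    publication_text: str,
    fields: Sequence[str],
    block_size: int = 20,
) -> Extraction:
    """Run the extractor block_size times and keep the per-field mode."""
    runs = [extract_once(publication_text) for _ in range(block_size)]
    return {field: modal_value([run.get(field) for run in runs]) for field in fields}


def make_stub_extractor() -> Callable[[str], Extraction]:
    """Hypothetical stand-in for an LLM call: misses one field on every 7th call."""
    call_count = 0

    def extract_once(publication_text: str) -> Extraction:
        nonlocal call_count
        call_count += 1
        if call_count % 7 == 0:
            return {"median_os_months": None, "hazard_ratio": 0.73}
        return {"median_os_months": 12.2, "hazard_ratio": 0.73}

    return extract_once


result = modal_extraction(
    make_stub_extractor(),
    "publication text goes here",
    ["median_os_months", "hazard_ratio"],
)
print(result)
```

In this sketch the stub misses `median_os_months` in 2 of the 20 runs, so the mode still recovers the value returned by the other 18. In practice `extract_once` would wrap the actual model call and response parsing, which the abstract does not describe.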
RESULTS: Across 400 iterations of the automatic data extraction, GPT-4 accurately extracted over 99% of the necessary data when compared with the human data extraction. With the modal algorithm applied, data extraction accuracy reached 100% in every block (20 blocks of 20 iterations).
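The comparison between per-iteration accuracy and per-block modal accuracy can be sketched as follows. The abstract does not state how accuracy was scored, so exact-match scoring per field is an assumption here, and the reference values, field names, and error pattern are synthetic data invented for illustration, not study results.

```python
from collections import Counter

# Hypothetical human-extracted reference values for one publication.
REFERENCE = {"median_os_months": 12.2, "hazard_ratio": 0.73}


def field_accuracy(extracted, reference):
    """Fraction of reference fields whose extracted value matches exactly."""
    correct = sum(extracted.get(field) == value for field, value in reference.items())
    return correct / len(reference)


def block_modes(runs, fields, block_size=20):
    """Split runs into consecutive blocks and take the per-field mode of each block."""
    modes = []
    for start in range(0, len(runs), block_size):
        block = runs[start:start + block_size]
        modes.append({
            field: Counter(run.get(field) for run in block).most_common(1)[0][0]
            for field in fields
        })
    return modes


# Synthetic runs: 400 iterations, with one field missed on every 50th iteration.
runs = []
for i in range(400):
    run = dict(REFERENCE)
    if i % 50 == 0:
        run["median_os_months"] = None
    runs.append(run)

per_iteration = sum(field_accuracy(run, REFERENCE) for run in runs) / len(runs)
modes = block_modes(runs, list(REFERENCE))
per_block = [field_accuracy(mode, REFERENCE) for mode in modes]

print(f"per-iteration accuracy: {per_iteration:.4f}")
print(f"blocks with perfect modal extraction: {sum(a == 1.0 for a in per_block)} of {len(per_block)}")
```

The mode only corrects errors that are in the minority within a block. If the model returned the same wrong value in most of a block's iterations, the modal result would be wrong too.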
CONCLUSIONS: Whilst GPT-4 generally extracts the correct data, there are occasions when it fails to extract all required data from a publication. We have demonstrated an approach that improves extraction accuracy and, in the case study considered, results in perfect extraction by GPT-4. It is a useful method for demonstrating the accuracy, repeatability, and reliability of the extracted data. Work to apply this approach to the other automated stages of NMA is underway.
Conference/Value in Health Info
Value in Health, Volume 27, Issue 6, S1 (June 2024)
Code
MSR18
Topic
Clinical Outcomes, Methodological & Statistical Research, Study Approaches
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics, Comparative Effectiveness or Efficacy, Meta-Analysis & Indirect Comparisons
Disease
No Additional Disease & Conditions/Specialized Treatment Areas, Oncology