Opportunities and Limitations in the Use of AI to Assist With Data Extraction in Systematic Literature Reviews
Author(s)
Roussi K1, Rice H2, King E2, Martin A2
1Crystallise, Basildon, ESS, UK, 2Crystallise, Stanford-le-Hope, UK
OBJECTIVES: Data extraction (DE) is a time-consuming and error-prone component of a systematic literature review (SLR). We aimed to assess technical factors affecting the efficiency of manual DE and to evaluate how far AI tools can increase DE accuracy and speed.
METHODS: Data on study design, study size, objective, inclusion/exclusion criteria, key findings and baseline characteristics (age, gender, ethnicity) were manually extracted by an experienced systematic reviewer from 10 conference abstracts, 10 editable full-text PDFs and 6 non-editable full-text PDFs (i.e. scanned or photocopied versions of the original documents). Free versions of the Elicit and Perplexity AI platforms and a subscription version of aiPDF were then asked to extract the same data into the same Excel DE template or an equivalent table.
RESULTS: The duration of manual DE increased with the complexity of the document type/format (mean per document: conference abstract = 10.18 minutes, editable PDF = 18.13 minutes, non-editable PDF = 28.0 minutes). Elicit allowed the selection of bespoke outcome categories for extraction; its output was generally accurate and returned within seconds. However, outcome selection had to be repeated manually for each study, and export to a CSV file was only possible with a paid subscription. Perplexity required a prompt specifying the data to be extracted. Some data were extracted correctly, but other parts of the output were fabricated. The aiPDF platform was unable to complete the DE because it could not cope with ambiguities in the DE template.
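The reported means can be combined with the document counts given in METHODS to gauge the overall manual workload. The following is a minimal sketch of that arithmetic, assuming each mean applies to every document of its type; the names `DOCUMENTS`, `total_minutes` and `relative_to` are illustrative and do not come from the study.

```python
# Document counts (from METHODS) and mean manual extraction time in minutes
# per document (from RESULTS). Assumption: each mean covers all documents
# of that type.
DOCUMENTS = {
    "conference abstract": (10, 10.18),
    "editable PDF": (10, 18.13),
    "non-editable PDF": (6, 28.0),
}


def total_minutes(documents):
    """Sum count * mean minutes over all document types."""
    return sum(count * mean for count, mean in documents.values())


def relative_to(documents, baseline):
    """Ratio of each type's mean time to the baseline type's mean time."""
    base_mean = documents[baseline][1]
    return {name: mean / base_mean for name, (_, mean) in documents.items()}


if __name__ == "__main__":
    total = total_minutes(DOCUMENTS)
    print(f"Estimated total manual time: {total:.1f} min ({total / 60:.1f} h)")
    for name, ratio in relative_to(DOCUMENTS, "conference abstract").items():
        print(f"{name}: {ratio:.2f}x the conference-abstract mean")
```

On these figures, the 26 documents would take roughly 451 minutes of manual extraction in total, and a non-editable PDF takes about 2.75 times as long as a conference abstract.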
CONCLUSIONS: Despite the rapid evolution of AI tools, limitations in their use still delay their effective incorporation into the SLR process. These results highlight the need to restructure current data extraction sheets into a clearer, more comprehensive format that can be more easily understood by both online AI tools and human researchers.
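One way to picture a less ambiguous extraction sheet is as an explicit schema with one named field per item listed in METHODS. The sketch below is hypothetical and is not the authors' Excel template: the class and field names (`ExtractionRecord`, `BaselineCharacteristics`, `missing_fields`) are assumptions chosen for illustration. Unfilled fields stay `None` and are reported as missing, so gaps are flagged rather than guessed.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class BaselineCharacteristics:
    # Hypothetical field names mirroring the baseline items listed in METHODS.
    age: Optional[str] = None
    gender: Optional[str] = None
    ethnicity: Optional[str] = None


@dataclass
class ExtractionRecord:
    # One explicitly named field per extracted item; None means "not reported".
    study_design: Optional[str] = None
    study_size: Optional[int] = None
    objective: Optional[str] = None
    inclusion_criteria: Optional[str] = None
    exclusion_criteria: Optional[str] = None
    key_findings: Optional[str] = None
    baseline: BaselineCharacteristics = field(default_factory=BaselineCharacteristics)


def missing_fields(record):
    """List the names of fields still set to None, including nested baseline fields."""
    missing = [f.name for f in fields(record) if getattr(record, f.name) is None]
    missing += [
        f"baseline.{f.name}"
        for f in fields(record.baseline)
        if getattr(record.baseline, f.name) is None
    ]
    return missing
```

Calling `missing_fields` on a partially completed record gives a human reviewer or an AI tool's operator a checklist of what still needs to be extracted or verified against the source document.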
Conference/Value in Health Info
Value in Health, Volume 27, Issue 12, S2 (December 2024)
Code
SA63
Topic
Study Approaches
Topic Subcategory
Literature Review & Synthesis
Disease
No Additional Disease & Conditions/Specialized Treatment Areas