Comparison of AI-Enhanced Tools for Automating Scientific Literature Reviews
Author(s)
Maciej Grys, PhD1, Roman Casciano, MS2, Izabela Pieniazek, MSc1;
1Certara, Cracow, Poland, 2Certara, Radnor, PA, USA
OBJECTIVES: Artificial Intelligence (AI) is transforming scientific literature review (LR) by accelerating and automating the review process. This study compares four commercially available AI-enhanced LR tools across various stages of the review process.
METHODS: Four AI-assisted LR tools (T1, T2, T3, T4) were evaluated for their performance in literature search, abstract and full-text screening, and data extraction using two live projects, one systematic and one targeted.
RESULTS: The tools assessed included one developed over a decade ago and three introduced in recent years. Three tools utilized publicly available large language models (LLMs) with internal adjustments, while one employed a proprietary LLM (T3). One utilized non-generative AI (T1), and two used generative AI (T3, T4). One tool enabled concept-based AI-assisted searching (T1). Three tools offered AI-driven abstract re-ranking, prioritizing relevant abstracts (T1, T2, T3). Two tools offered AI-assisted abstract screening, with AI acting as a second reviewer after training (T1, T2). One tool demonstrated a nearly tenfold lower false-negative rate than the others (T1). One tool automatically extracted all PICOS elements from abstracts and provided live AI performance statistics, expediting the identification of relevant papers (T1). Another tool categorized abstracts by answering yes/no questions, significantly reducing screening time (T2). Three tools supported AI-driven data extraction from PDFs, with non-generative AI (T1) outperforming generative AI (T3, T4) in accuracy. Importantly, reviewers maintained control over data selection and extraction at every stage. AI-assisted table extraction and critical appraisal were under development in all tools.
CONCLUSIONS: AI-enhanced LR tools effectively streamline targeted reviews, identifying key publications rapidly. However, caution is advised in systematic literature reviews (SLRs) to ensure compliance with regulations. While AI holds great potential to automate the review process, it should complement, not replace, human reviewers to maintain accuracy and reliability.
Conference/Value in Health Info
2025-05, ISPOR 2025, Montréal, Quebec, CA
Value in Health, Volume 28, Issue S1
Code
MSR56
Topic
Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
No Additional Disease & Conditions/Specialized Treatment Areas