In May 2024, the ISPOR Student Network Education Committee organized an insightful webinar exploring the transformative potential of Artificial Intelligence (AI) in systematic reviews. The session showcased the significant advancements AI brings to the field, enhancing efficiency, accuracy, and the overall robustness of evidence synthesis. Esteemed speakers shared their expertise and experiences, shedding light on how AI is revolutionizing traditional systematic review processes and the challenges that still need to be addressed. This brief summarizes highlights from these experts.
Harnessing the Power of AI in Systematic Reviews: How Can We Do Better?
Presented by: James Thomas, Professor of Social Research and Policy, EPPI Centre, UCL Social Research Institute, University College London
Conventional evidence synthesis processes, undertaken manually, are both time- and resource-intensive. This results in systematic reviews and other evidence synthesis products taking extended periods to produce, thereby delaying the provision of crucial evidence for decision-making. These reviews can quickly become outdated, especially in rapidly evolving fields, necessitating continuous surveillance and updating (e.g., living reviews or living evidence maps). Professor Thomas emphasized the essential role of automation technologies in enhancing efficiency, reducing time and costs, and maintaining rigor and reliability.
Machine learning already contributes significantly to this endeavor. Unsupervised approaches enable automatic clustering of records and visualization of search results. Study classification, one of the most reliable techniques, reduces manual effort by identifying randomized trials automatically. Additionally, systematic review software products often include features that allow the scope of a review to be refined during the screening process, prioritizing the most relevant records early on. As an example, the EPPI Reviewer platform maintains living reviews through a combination of machine learning and integration with the OpenAlex bibliographic database. Machine learning models assess new records for eligibility, automatically incorporating relevant records into the review.
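The screening prioritization described above can be illustrated with a deliberately simple sketch. This is not the EPPI Reviewer implementation (which is not public); it is a toy ranker, assumed for demonstration, that scores unscreened records by how much their vocabulary overlaps with records the reviewer has already included versus excluded:

```python
# Illustrative sketch of screening prioritization (NOT the EPPI Reviewer
# algorithm): rank unscreened records by a relevance score derived from
# the reviewer's include/exclude decisions so far.

from collections import Counter

def tokenize(text):
    return text.lower().split()

def relevance_score(record, include_terms, exclude_terms):
    """Net overlap of a record's tokens with included vs. excluded vocabulary."""
    return sum(include_terms[t] - exclude_terms[t] for t in tokenize(record))

def prioritize(unscreened, screened):
    """Return unscreened records ordered most-likely-relevant first."""
    include_terms, exclude_terms = Counter(), Counter()
    for text, included in screened:
        (include_terms if included else exclude_terms).update(tokenize(text))
    return sorted(unscreened,
                  key=lambda r: relevance_score(r, include_terms, exclude_terms),
                  reverse=True)

screened = [("randomized trial of statins", True),
            ("editorial on hospital funding", False)]
unscreened = ["randomized controlled trial of aspirin",
              "opinion piece on funding policy"]
print(prioritize(unscreened, screened)[0])
# → randomized controlled trial of aspirin
```

Production tools use trained classifiers rather than raw term counts, but the workflow is the same: each new reviewer decision updates the model, which re-ranks the remaining records so relevant ones surface early.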
Despite these advancements, Professor Thomas indicated that most reviews made minimal use of these features. For instance, although screening prioritization algorithms can successfully identify relevant records early, reviewers often screen all records manually due to the absence of widely accepted stopping rules. This lack of adoption could hinder the field of evidence synthesis, as numerous new tools promise extensive automation of traditionally manual processes. The risk lies in either missing the opportunity to enhance efficiency or compromising established standards of rigor and transparency. Notably, many new AI tools, such as large language models for zero-shot learning and generative AI for data extraction and risk of bias assessments, demonstrate impressive capabilities. However, there is a significant gap in the evidence base to guide the appropriate use of these tools. Addressing this gap is crucial for the field.
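To make the missing-stopping-rules point concrete, here is one commonly discussed heuristic, sketched under the assumption that records arrive in prioritized order: stop screening once a fixed run of consecutive records has been judged irrelevant. This is an illustration only; validated stopping rules involve more careful statistical guarantees, which is precisely the evidence gap Professor Thomas highlighted.

```python
# A simple stopping-rule heuristic (illustrative assumption, not an
# accepted standard): stop once `window` consecutive prioritized records
# have been judged irrelevant.

def should_stop(decisions, window=50):
    """decisions: booleans in screening order, True = relevant."""
    if len(decisions) < window:
        return False
    return not any(decisions[-window:])

decisions = [True, False, True] + [False] * 50
print(should_stop(decisions, window=50))  # → True
```

The difficulty is choosing `window`: too small risks missing relevant records, too large erases the efficiency gain, and there is little published evidence on how to set it for a given review.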
The Future of Systematic Reviews: Integrating AI with Human Expertise
Presented by: Artur Nowak, Chief Technology Officer, Evidence Prime, Krakow, Poland
Artur Nowak's presentation focused on the transformative potential of AI in systematic reviews, exploring whether "black box" AI could replace traditional methods, the integration of AI into current processes, and the future of systematic reviews with AI advancements. He highlighted the progression from the First Industrial Revolution to the Fourth Industrial Revolution, emphasizing the significant impact of transformative technologies such as railroads, telegraphs, computers, and now AI and robotics.
Systematic reviews are characterized by their explicit reproducible methodology and comprehensive search strategies to identify all relevant studies. Nowak questioned whether a closed, opaque AI system could replace this transparent and meticulous process. He suggested that instead of asking an AI model to produce the final systematic review manuscript, it would be more practical to have AI operate within a systematic review tool, allowing researchers to inspect intermediary results. This approach is akin to asking a colleague to use a review tool rather than handing over a final paper. He predicted that systematic review tools will remain essential, with increasing automation of tasks such as data extraction and deduplication. Nowak emphasized that AI will be used to perform reviews through specific tools, allowing for human adjustments and oversight rather than creating a systematic review manuscript independently. The need for AI tools to support, rather than replace, human judgment and expertise in systematic reviews was stressed.
Nowak provided practical examples of how AI can assist in data extraction processes, improving efficiency and accuracy. He explained how Laser AI can expedite data extraction while keeping the process transparent and adjustable by users. One notable example included how Laser AI handles table extraction, where a Computer Vision model detects the table structure, and the user specifies how the table from the paper should be mapped to the extraction form.
The presentation concluded with an acknowledgment of AI's significant potential to enhance systematic review processes by automating repetitive tasks and aiding data extraction. However, the core principles and transparent methodologies of systematic reviews will remain indispensable. Future advancements will likely see AI serving as a powerful tool to support researchers rather than replacing the systematic review process entirely.
Will AI for Screening and Data Extraction be the Key Enabler of Living Systematic Reviews?
Presented by: Piet Hanegraaf, CEO of Pitts.ai, Zeist, The Netherlands
AI offers significant advantages for systematic reviews for two key reasons, as highlighted by Piet Hanegraaf. First, it addresses the challenge of maintaining up-to-date reviews. Systematic reviews quickly become outdated as new evidence emerges, though the medical questions they address remain relevant. Living systematic reviews could mitigate this issue, but the workload associated with keeping them current is substantial. AI can alleviate this burden by automating screening and data extraction, thereby facilitating the broader adoption of living systematic reviews. Second, AI helps address the issue of completeness. Systematic reviews often have limited scope due to constraints in budget and manpower, leading to gaps in evidence collection. AI can reduce this workload and enhance the capacity to cover a wider range of interventions and topics, which is particularly relevant for health economics and can accelerate the health technology assessment (HTA) process.
Large language models (LLMs) play a crucial role in semi-automating screening and data extraction. Platforms like Pitts leverage AI to assist with these tasks. For example, the Pitts user interface supports prompt engineering to optimize semi-automated data extraction, utilizing techniques such as few-shot prompting to improve performance and chaining prompts to manage subtasks. Additionally, the platform includes features for splitting datasets into engineering and validation subsets to test prompt generalizability, and it can prompt LLMs to quote text in PDFs for manual verification of data items or exclusion reasons.
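The few-shot prompting and prompt chaining mentioned above can be sketched as follows. The actual Pitts prompts and model API are not public, so `call_llm` is a placeholder stub; the point is the prompt structure: worked examples precede the new input (few-shot), and a second prompt asks the model to quote supporting text for verification (chaining):

```python
# Sketch of few-shot prompting and prompt chaining for data extraction.
# `call_llm` is a hypothetical placeholder for whatever hosted model the
# platform uses; the prompt construction is what this illustrates.

FEW_SHOT_EXAMPLES = [
    ("Patients (n=120) were randomized to two arms.", "sample_size: 120"),
    ("We enrolled 85 participants across three sites.", "sample_size: 85"),
]

def build_extraction_prompt(abstract, examples=FEW_SHOT_EXAMPLES):
    """Few-shot prompt: task instruction, worked examples, then the new input."""
    parts = ["Extract the sample size from the abstract."]
    for text, answer in examples:
        parts.append(f"Abstract: {text}\nAnswer: {answer}")
    parts.append(f"Abstract: {abstract}\nAnswer:")
    return "\n\n".join(parts)

def call_llm(prompt):
    # Placeholder: a real implementation would call an LLM API here.
    return "sample_size: 200"

def extract_with_chained_prompts(abstract):
    """Chain two prompts: extract a value, then request a supporting quote."""
    draft = call_llm(build_extraction_prompt(abstract))
    verify_prompt = (f"Quote the sentence in the abstract that supports: {draft}\n"
                     f"Abstract: {abstract}")
    quote = call_llm(verify_prompt)
    return draft, quote
```

The second, quote-requesting step mirrors the manual-verification feature described above: forcing the model to point at source text gives reviewers something concrete to check against the PDF.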
Future optimizations of LLMs for systematic reviews include fine-tuning foundation models for specific tasks such as screening and data extraction. Other potential enhancements involve retrieval optimization using vector embeddings, preprocessing optimizations like converting PDFs to machine-readable text, alternative algorithm architectures such as encoder-decoder models, and active learning with priority screening algorithms. These improvements are expected to enhance the efficiency of LLMs in handling screening and data extraction tasks.
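The retrieval-optimization idea can be shown in miniature. Real systems use learned dense embeddings; the sketch below substitutes crude bag-of-words vectors (an assumption for self-containedness), but the mechanism is the same: embed the query and candidate passages, then rank passages by cosine similarity so the model only has to read the most relevant text.

```python
# Toy passage retrieval by cosine similarity. Bag-of-words counts stand
# in for learned dense embeddings; the ranking mechanism is identical.

import math
from collections import Counter

def embed(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, passages, top_k=1):
    """Return the top_k passages most similar to the query."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:top_k]

passages = ["adverse events were recorded in 12 patients",
            "the trial was funded by a national grant"]
print(retrieve("adverse events", passages))
```

In a data-extraction pipeline, such retrieval narrows a full-text PDF down to the few passages likely to contain a given data item before the LLM is prompted, cutting cost and reducing the chance of the model drawing on irrelevant text.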
In summary, AI, particularly large language models, has the potential to significantly accelerate the screening and data extraction components of systematic reviews, thereby speeding up the HTA process. AI can function either as an assistant, providing sentence highlighting and suggestions, or as an autonomous agent for data extraction. However, rigorous validation of AI tools remains essential.