SuperDeduper: Testing a New AI-Powered System for Deduplicating References in Literature Reviews
Author(s)
Nowak A1, Sadowska E2, Borowiack E3
1Evidence Prime, Krakow, MA, Poland, 2Evidence Prime, Krakow, Poland, 3Evidence Prime, Kraków, Poland
OBJECTIVES: Deduplication of records is a crucial step in systematic reviews, including those of costs and cost-effectiveness outcomes. It is often long and laborious, but when performed properly it prevents researchers from reviewing the same reference retrieved from different databases, reducing the time spent on screening. At the same time, deduplication errors pose a risk of losing relevant studies before screening even starts.
Our aim was to retrospectively evaluate the accuracy of SuperDeduper, the AI-assisted deduplication tool implemented in Laser AI, using publicly available benchmark datasets.
METHODS: We looked for gold standard datasets that had been used to evaluate automatic deduplication in at least one reference management or literature review software package. We ran SuperDeduper on the same datasets to compare the number of false positives (unique references erroneously marked as duplicates) and false negatives (duplicate references missed by the method).
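The two error counts defined in the methods can be sketched as a comparison of sets of record identifiers. This is an illustration only, not the SuperDeduper implementation: the function name `count_errors`, the record IDs, and the assumption that both the gold standard and the tool output are available as sets of IDs flagged as duplicates are all hypothetical.

```python
def count_errors(gold_duplicates, flagged_duplicates):
    """Compare a tool's flagged duplicates against a gold standard.

    Both arguments are collections of record IDs (a hypothetical
    representation chosen for this sketch).
    Returns (true_positives, false_positives, false_negatives).
    """
    gold = set(gold_duplicates)
    flagged = set(flagged_duplicates)
    true_positives = len(gold & flagged)    # duplicates correctly flagged
    false_positives = len(flagged - gold)   # unique records wrongly flagged
    false_negatives = len(gold - flagged)   # duplicates the tool missed
    return true_positives, false_positives, false_negatives


# Toy example with made-up record IDs
gold = {"r2", "r5", "r9"}
flagged = {"r2", "r5", "r7"}
print(count_errors(gold, flagged))  # (2, 1, 1)
```

In a deduplication setting a false positive is the costlier error, because a wrongly removed record never reaches screening.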
RESULTS: The dataset from [1] had previously been used to evaluate the following tools: Ovid multifile search, Covidence, Rayyan, EndNote desktop X9, Mendeley, and Zotero. On this set of 3130 records, SuperDeduper performed best, correctly identifying 1177 duplicates with no false positives and 61 false negatives (specificity: 100%, sensitivity: 95%).
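The reported sensitivity and specificity can be reproduced from the counts given in the results. The sketch below assumes that every one of the 3130 records falls into exactly one of the four categories, so the number of true negatives is derived by subtraction rather than taken from the abstract; the function names are illustrative.

```python
def sensitivity(tp, fn):
    """Share of true duplicates that were detected."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of unique records that were correctly left alone."""
    return tn / (tn + fp)


total_records = 3130
tp = 1177  # duplicates correctly identified
fp = 0     # unique records wrongly marked as duplicates
fn = 61    # duplicates that were missed
# Assumption: the remaining records are unique and correctly retained.
tn = total_records - tp - fp - fn

print(f"sensitivity: {sensitivity(tp, fn):.1%}")  # sensitivity: 95.1%
print(f"specificity: {specificity(tn, fp):.1%}")  # specificity: 100.0%
```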
CONCLUSIONS: Our results suggest that the SuperDeduper module is a safe and effective method to remove duplicates without human supervision. Work is ongoing to identify further benchmark datasets; results on them, as well as comparisons with other methods, will be presented during the conference.
[1] McKeown, S., Mir, Z.M. Considerations for conducting systematic reviews: evaluating the performance of different methods for de-duplicating references. Syst Rev 10, 38 (2021). https://doi.org/10.1186/s13643-021-01583-y
Conference/Value in Health Info
Value in Health, Volume 27, Issue 12, S2 (December 2024)
Code
MSR7
Topic
Methodological & Statistical Research, Study Approaches
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics, Literature Review & Synthesis
Disease
No Additional Disease & Conditions/Specialized Treatment Areas