Superdeduper: Testing New AI-Powered System for Deduplicating References in Literature Reviews

Author(s)

Nowak A1, Sadowska E2, Borowiack E3
1Evidence Prime, Krakow, MA, Poland, 2Evidence Prime, Krakow, Poland, 3Evidence Prime, Kraków, Poland

OBJECTIVES: Deduplication of records is a crucial step in systematic reviews, including those of costs and cost-effectiveness outcomes. It is often long and laborious, but when performed properly it prevents researchers from reviewing the same reference retrieved from different databases, reducing the time spent on screening. At the same time, errors in deduplication carry a risk of losing relevant studies before screening even starts.

Our aim was to retrospectively evaluate the accuracy of the AI-assisted deduplication tool implemented in Laser AI (SuperDeduper) using publicly available benchmark datasets.

METHODS: We looked for gold standard datasets that had been used to evaluate automatic deduplication in at least one reference management or literature review software package. We ran SuperDeduper on the same datasets and counted the false positives (unique references erroneously marked as duplicates) and false negatives (duplicate references missed by the method) for comparison with the published results.
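As an illustration of the error counts defined above, the following is a minimal sketch, not the evaluation code used in the study. It assumes that both the gold standard and the tool output can be represented as sets of record identifiers marked as duplicates; the function name and the example identifiers are hypothetical.

```python
def count_errors(gold_duplicates, flagged_duplicates):
    """Compare the records a tool flagged as duplicates with a gold standard.

    Both arguments are iterables of record IDs (a hypothetical representation).
    Returns (true_positives, false_positives, false_negatives).
    """
    gold = set(gold_duplicates)
    flagged = set(flagged_duplicates)
    true_positives = len(gold & flagged)   # duplicates correctly flagged
    false_positives = len(flagged - gold)  # unique records wrongly flagged
    false_negatives = len(gold - flagged)  # duplicates the tool missed
    return true_positives, false_positives, false_negatives


# Toy example with made-up record IDs
gold = {"r2", "r5", "r7"}
flagged = {"r2", "r5", "r9"}
print(count_errors(gold, flagged))  # (2, 1, 1)
```

A false positive is the costlier error here, because a unique reference removed at this stage never reaches screening.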

RESULTS: The dataset from [1] had previously been used to evaluate the following tools: Ovid multifile search, Covidence, Rayyan, EndNote desktop X9, Mendeley and Zotero. On this set of 3130 records, SuperDeduper performed best, correctly identifying 1177 duplicates with no false positives and 61 false negatives (specificity: 100%, sensitivity: 95%).
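The reported percentages can be reproduced from the counts with the short sketch below. It assumes that the number of true duplicates in the dataset equals the correctly identified duplicates plus the false negatives (1177 + 61 = 1238), a figure derived here rather than stated in the abstract; the function name is hypothetical.

```python
def sensitivity_specificity(tp, fp, fn, total):
    """Derive sensitivity and specificity from deduplication counts.

    Assumes every record is either a true duplicate (tp + fn)
    or a unique reference (the remainder of `total`).
    """
    duplicates = tp + fn            # true duplicates in the dataset
    uniques = total - duplicates    # unique references
    tn = uniques - fp               # unique references correctly kept
    return tp / duplicates, tn / uniques


sens, spec = sensitivity_specificity(tp=1177, fp=0, fn=61, total=3130)
print(f"sensitivity: {sens:.1%}, specificity: {spec:.1%}")
# sensitivity: 95.1%, specificity: 100.0%
```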

CONCLUSIONS: Our results suggest that the SuperDeduper module is a safe and effective method for removing duplicates without human supervision. Work is ongoing to identify further benchmark datasets; results on them, as well as comparisons with other methods, will be presented at the conference.

[1] McKeown, S., Mir, Z.M. Considerations for conducting systematic reviews: evaluating the performance of different methods for de-duplicating references. Syst Rev 10, 38 (2021). https://doi.org/10.1186/s13643-021-01583-y

Conference/Value in Health Info

2024-11, ISPOR Europe 2024, Barcelona, Spain

Value in Health, Volume 27, Issue 12, S2 (December 2024)

Code

MSR7

Topic

Methodological & Statistical Research, Study Approaches

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics, Literature Review & Synthesis

Disease

No Additional Disease & Conditions/Specialized Treatment Areas
