Frequency and Type of Errors in Data Extraction Within Systematic Literature Reviews

Author(s)

Rice H¹, Roussi K², King E¹, Martin A¹
¹Crystallise, Stanford-le-Hope, UK; ²Crystallise, Basildon, ESS, UK

OBJECTIVES: To examine the frequency and type of errors in data extraction (DE) within systematic literature reviews (SLRs).

METHODS: We analysed DE checking sheets from eight SLRs, varying in topic and size, that our organisation conducted between 2022 and 2023. We calculated the proportion of papers with errors in each SLR and the total number and type of errors per paper and per project. A score-based approach was devised to assess the difficulty of extraction, based on the publication type (full text or abstract), whether the file was editable, whether it was highlighted ahead of DE, the number of pages, and whether the SLR was new or an update.
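As an illustration of the tallies described in the methods, the following is a minimal sketch rather than the analysis actually used. It assumes a hypothetical checking-sheet layout with one row per data point, and it assumes that any outcome other than "correct" counts as a change made at checking. The record type, field names, outcome labels and toy rows are all invented for illustration and are not data from the study.

```python
from collections import Counter
from dataclasses import dataclass


# Hypothetical checking-sheet record: one row per data point reviewed by the checker.
@dataclass(frozen=True)
class DataPoint:
    slr: str      # project identifier (assumed field)
    paper: str    # paper identifier within the project (assumed field)
    outcome: str  # assumed labels: "correct", "omission", "incorrect",
                  # "misidentification" or "other"


def error_summary(points):
    """Return (share of papers with at least one non-correct data point,
    share of data points per outcome)."""
    outcome_counts = Counter(p.outcome for p in points)
    papers = {(p.slr, p.paper) for p in points}
    # Assumption: every outcome other than "correct" counts as a checker change.
    papers_with_error = {(p.slr, p.paper) for p in points if p.outcome != "correct"}
    paper_rate = len(papers_with_error) / len(papers)
    point_rates = {k: v / len(points) for k, v in outcome_counts.items()}
    return paper_rate, point_rates


# Toy rows for illustration only; these are not data from the study.
toy = [
    DataPoint("SLR-A", "P1", "correct"),
    DataPoint("SLR-A", "P1", "omission"),
    DataPoint("SLR-A", "P2", "correct"),
    DataPoint("SLR-B", "P1", "correct"),
]
paper_rate, point_rates = error_summary(toy)
```

With the toy rows, one of three papers has a non-correct data point, and three of four data points are correct. Per-project figures could be obtained the same way by filtering the rows on `slr` before calling `error_summary`.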

RESULTS: Across all SLRs, 59% of included papers had at least one error at initial DE that was corrected during checking. Data were extracted correctly in 85.52% of the 96,675 data points evaluated. The most common error was omission (8.23%), where the checker identified additional relevant data in the paper. Incorrect data, where the originally extracted value was wrong, occurred in 2.26% of data points. The checker made other changes to the DE (e.g. inserting comments) in 3.89% of data points. Data misidentification (e.g. a correct value entered in the wrong column) occurred in 0.49% of data points. No obvious pattern was found between the DE error rate and either the duration of DE or the paper's DE difficulty score.

CONCLUSIONS: Data extraction is an essential part of SLRs; however, it is error-prone. Other studies have identified DE error rates of 0.5% to 15% and at least one error in 66.8% to 99.3% of papers in published SLRs, so the >85% accuracy in our process before pre-publication checking compares favourably. Methods to clarify all outcomes to be extracted before DE starts should be explored to reduce omissions.

Conference/Value in Health Info

2024-11, ISPOR Europe 2024, Barcelona, Spain

Value in Health, Volume 27, Issue 12, S2 (December 2024)

Code

SA83

Topic

Study Approaches

Topic Subcategory

Literature Review & Synthesis

Disease

No Additional Disease & Conditions/Specialized Treatment Areas
