Assessing the Representativeness of Real-World Claims Databases

Author(s)

Stephenson J1, Teng CC2, Harris K3
1Carelon Research, Wilmington, DE, USA, 2Carelon Research, Wilmington, DE, USA, 3Carelon Research, Wilmington, DE, USA

OBJECTIVES: Despite their widespread use, little is known about the representativeness of real-world claims databases. This study assesses the representativeness of a large US claims database using the 2020 US Census population as a benchmark.

METHODS: The Healthcare Integrated Research Database (HIRD) is a large administrative claims database maintained by Carelon Research for health-related research. We assessed representativeness by comparing the 2020 HIRD researchable population, consisting of individuals enrolled in commercial and Medicare health plans, with self-reported data from the 2020 Census population on a common set of demographic characteristics. The characteristics included sex (2 categories), age (5-year categories), region (4 categories), and race/ethnicity (5 categories).

We compared the probability distributions for each characteristic using two alternative measures of similarity. The standardized mean difference (SMD) assessed the magnitude, or effect size, of the difference, where 0.2 represents a small effect, 0.5 a medium effect, and 0.8 a large effect. The overlap index (η) measured the degree of overlap between the two distributions, where 0% means no overlap and 100% means complete overlap.
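
The abstract does not give formulas for either measure, so the following is only a minimal sketch under stated assumptions. It assumes η for a categorical characteristic is the sum of the category-wise minimum proportions, and it shows the SMD only in its common two-category form (difference in proportions divided by the pooled standard deviation). Characteristics with more than two categories need a multivariate generalization of the SMD, which is not shown here. The function names and the example proportions are hypothetical and are not taken from the study.

```python
import math


def overlap_index(p, q):
    """Overlap between two categorical distributions (assumed definition).

    p and q hold counts or proportions for the same categories, in the
    same order. Each is normalized to sum to 1, then the category-wise
    minimum is summed: 0.0 means no overlap, 1.0 means identical.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same categories")
    total_p, total_q = sum(p), sum(q)
    return sum(min(a / total_p, b / total_q) for a, b in zip(p, q))


def binary_smd(p1, p2):
    """SMD for a two-category characteristic (assumed formula).

    p1 and p2 are the proportions in one category (for example, female)
    in each population.
    """
    pooled_var = (p1 * (1 - p1) + p2 * (1 - p2)) / 2
    if pooled_var == 0:
        raise ValueError("SMD is undefined when both variances are zero")
    return abs(p1 - p2) / math.sqrt(pooled_var)


# Hypothetical proportions for illustration only (not study data).
benchmark = [0.50, 0.50]
database = [0.60, 0.40]

eta = overlap_index(benchmark, database)
smd = binary_smd(benchmark[0], database[0])
print(f"overlap = {eta:.1%}, SMD = {smd:.3f}")
```

Because the overlap index works on proportions, it is unaffected by the large difference in population size between the two sources.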

RESULTS: Comparing the 2020 US Census (N=331,449,281) and 2020 HIRD commercial and Medicare (N=24,774,264) populations, we determined for sex, SMD=0.02 and η=99.2%; for age, SMD=0.19 and η=92.0%; for region, SMD=0.16 and η=94.8%; and for race/ethnicity, SMD=0.66 and η=86.8%.
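
As a reading aid, the reported SMD values can be placed against the effect-size thresholds stated in the methods. The sketch below uses only the values reported above. Treating the thresholds as band edges, and the label "negligible" for values under 0.2, are assumptions made for illustration rather than part of the study.

```python
# Reported results: characteristic -> (SMD, overlap index in percent).
reported = {
    "sex": (0.02, 99.2),
    "age": (0.19, 92.0),
    "region": (0.16, 94.8),
    "race/ethnicity": (0.66, 86.8),
}


def effect_label(smd):
    """Map an SMD to a band using 0.2 / 0.5 / 0.8 as assumed band edges."""
    if smd < 0.2:
        return "negligible"
    if smd < 0.5:
        return "small"
    if smd < 0.8:
        return "medium"
    return "large"


labels = {name: effect_label(smd) for name, (smd, _) in reported.items()}

for name, (smd, eta) in reported.items():
    print(f"{name}: SMD={smd:.2f} ({labels[name]}), overlap={eta:.1f}%")
```

Under this banding, sex, age, and region fall below the small-effect threshold, while race/ethnicity falls in the medium band.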

CONCLUSIONS: We found the 2020 HIRD population to be highly representative of the 2020 US Census population in terms of sex, age, and region, while it appeared less representative in terms of race/ethnicity. Differing modes of determining race/ethnicity may have affected this comparison: the HIRD race/ethnicity information was determined using multiple methods (e.g., self-report, imputation), whereas the US Census information was self-reported.

Conference/Value in Health Info

2024-05, ISPOR 2024, Atlanta, GA, USA

Value in Health, Volume 27, Issue 6, S1 (June 2024)

Code

RWD100

Topic

Real World Data & Information Systems

Topic Subcategory

Health & Insurance Records Systems, Reproducibility & Replicability

Disease

No Additional Disease & Conditions/Specialized Treatment Areas
