Utilization of High-Dimensional Propensity Score and Targeted Maximum Likelihood Estimation with Machine Learning to Improve Causal Effect Estimation in Patients with Nonvalvular Atrial Fibrillation and Hypertension

Author(s)

Zhang D1, Zhang Y2, Ahn SW2, Gruber S3, van der Laan M4, Iyer R5, Reshef S6, Tian MY7
1Teva Pharmaceuticals, Frederick, MD, USA, 2Teva Branded Pharmaceutical Products R&D, Inc., West Chester, PA, USA, 3TLrevolution, Cambridge, MA, USA, 4UC Berkeley, Berkeley, CA, USA, 5Teva Pharmaceuticals, West Chester, PA, USA, 6Teva Branded Pharmaceutical Products R&D, Inc., Epidemiology and Global Health Economics and Outcomes Research, Parsippany, NJ, USA, 7Teva Branded Pharmaceutical Products R&D, Inc., Real World Evidence Statistics, Skillman, NJ, USA

OBJECTIVES: Applying targeted learning to high-dimensional data offers an opportunity to improve causal effect estimation in real-world evidence generation. We examined whether integrating high-dimensional propensity score (HDPS) and/or targeted maximum likelihood estimation (TMLE) with machine learning (ML) produces robust causal estimates from healthcare claims data.

METHODS: We conducted a retrospective cohort study using the 2012-2022 MarketScan® database to compare effectiveness, measured as time to the first composite endpoint of ischemic stroke and systemic embolism, between two treatment groups: concurrent use of anticoagulant A and a calcium channel blocker (A-CCB) vs. anticoagulant B and a CCB (B-CCB), among patients aged ≥18 years diagnosed with nonvalvular atrial fibrillation and hypertension. We estimated treatment effects as hazard ratios (HRs) with 95% confidence intervals (95%CI) for inverse probability of treatment weighting (IPTW), and as cumulative risk ratios (CRRs) and risk differences for TMLE. For rare outcomes, the CRR approximates the HR. We implemented three models: (1) IPTW with pre-specified confounders, (2) IPTW and censoring weights with pre-specified confounders and HDPS+LASSO-derived covariates, and (3) TMLE with pre-specified confounders and HDPS+LASSO-derived covariates. LASSO is an ML algorithm used here for variable selection. We compared the results with a randomized trial's findings (HR=0.88, 95%CI=0.74-1.03 when comparing anticoagulant A vs. B in similar endpoints and population).
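The weighting idea behind IPTW can be illustrated with a minimal sketch. It is not the study's implementation: it assumes a single binary confounder, estimates the propensity score as the treated fraction within each confounder stratum, uses stabilized weights, and compares weighted risks rather than fitting a weighted Cox model. The function names and toy data are hypothetical.

```python
from collections import defaultdict


def stabilized_iptw(treated, stratum):
    """Stabilized IPTW weights for a binary treatment (illustrative only).

    The propensity score P(A=1 | X) is estimated as the treated fraction
    within each confounder stratum, a stand-in for a fitted model.
    """
    n = len(treated)
    p_treated = sum(treated) / n  # marginal P(A=1)
    counts = defaultdict(lambda: [0, 0])  # stratum -> [n, n_treated]
    for a, x in zip(treated, stratum):
        counts[x][0] += 1
        counts[x][1] += a
    weights = []
    for a, x in zip(treated, stratum):
        ps = counts[x][1] / counts[x][0]  # P(A=1 | X=x)
        if a == 1:
            weights.append(p_treated / ps)
        else:
            weights.append((1 - p_treated) / (1 - ps))
    return weights


def weighted_risk(outcome, treated, weights, arm):
    """Weighted outcome proportion within one treatment arm."""
    num = sum(w * y for y, a, w in zip(outcome, treated, weights) if a == arm)
    den = sum(w for a, w in zip(treated, weights) if a == arm)
    return num / den


# Hypothetical toy cohort: the outcome depends only on the stratum,
# and treatment is more common in the higher-risk stratum 0.
treated = [1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0]
stratum = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
outcome = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]

weights = stabilized_iptw(treated, stratum)
unit_weights = [1.0] * len(treated)

crude_rr = (weighted_risk(outcome, treated, unit_weights, 1)
            / weighted_risk(outcome, treated, unit_weights, 0))
iptw_rr = (weighted_risk(outcome, treated, weights, 1)
           / weighted_risk(outcome, treated, weights, 0))
print(f"crude risk ratio: {crude_rr:.2f}, IPTW risk ratio: {iptw_rr:.2f}")
```

In this toy cohort the crude comparison is confounded by the stratum, while the weighted comparison recovers the null effect built into the data. In practice the propensity model would include many covariates, which is where HDPS and LASSO selection enter.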

RESULTS: The crude incidence rates of the composite cardiovascular outcome were 2.6 and 5.4 per 100 person-years in the A-CCB (n=3625) and B-CCB (n=2477) groups, respectively. The effect estimates for the three models were: (1) HR=0.65 (95%CI=0.41-1.03) for IPTW with pre-specified confounders, (2) HR=0.85 (95%CI=0.54-1.34) for IPTW and censoring weights with HDPS and LASSO-selected variables, and (3) CRR=0.84 (95%CI=0.47-1.22) for TMLE with HDPS and LASSO-selected variables.
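Models (2) and (3) are reported on different scales (HR and CRR), so comparing them relies on the rare-outcome approximation. The sketch below checks that approximation numerically under an assumed proportional-hazards relationship, S1(t) = S0(t)^HR; the function name and the comparator risks are illustrative and are not taken from the study.

```python
def crr_under_proportional_hazards(hazard_ratio, reference_risk):
    """Cumulative risk ratio implied by a constant hazard ratio.

    Assumes proportional hazards, so S1(t) = S0(t) ** HR and a
    comparator-arm cumulative risk r0 maps to a treated-arm risk of
    1 - (1 - r0) ** HR.
    """
    if not 0 < reference_risk < 1:
        raise ValueError("reference_risk must be strictly between 0 and 1")
    treated_risk = 1 - (1 - reference_risk) ** hazard_ratio
    return treated_risk / reference_risk


# HR = 0.85 matches model (2); the comparator risks are hypothetical.
for r0 in (0.01, 0.05, 0.20, 0.50):
    crr = crr_under_proportional_hazards(0.85, r0)
    print(f"comparator risk {r0:.2f}: implied CRR = {crr:.3f}")
```

The implied CRR stays close to the HR when the comparator-arm risk is small and drifts toward 1 as the outcome becomes common, which is why the two scales are comparable only for rare outcomes.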

CONCLUSIONS: Our study suggests that, when using healthcare claims data, HDPS and TMLE with ML may reduce bias and produce robust causal estimates that align with a previous clinical trial with similar endpoints.

Conference/Value in Health Info

2024-05, ISPOR 2024, Atlanta, GA, USA

Value in Health, Volume 27, Issue 6, S1 (June 2024)

Code

MSR48

Topic

Methodological & Statistical Research

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics, Confounding, Selection Bias Correction, Causal Inference, Missing Data

Disease

Cardiovascular Disorders (including MI, Stroke, Circulatory), Drugs
