Long-Term Survival Prediction in Early Breast Cancer: A Machine Learning Approach With Random Survival Forest
Author(s)
Yoon H¹, Han S¹, Suh HS², Park C¹
¹The University of Texas at Austin, Austin, TX, USA, ²College of Pharmacy, Kyung Hee University, Seoul, South Korea
OBJECTIVES: Predicting breast cancer (BC) survival curves can help improve patient outcomes. Because clinicopathological characteristics vary across age groups, age-group-specific models are necessary. This study aimed to develop survival prediction models for all-cause, BC-caused, and cardiovascular disease (CVD)-caused mortality in older women with hormone receptor-positive (HR+) early BC and to identify key prognostic factors for mortality.
METHODS: This retrospective cohort study used the 2006-2019 Surveillance, Epidemiology, and End Results Program (SEER)-Medicare database and included women aged ≥66 years with early BC who initiated adjuvant endocrine therapy (AET) between 2007 and 2009. The index date was the date of first AET use (anastrozole, exemestane, letrozole, or tamoxifen). Patients were observed for 1 year before the index date and followed for up to 10 years after it or until death. We used the random survival forest algorithm to develop models separately for two age groups (66-79 years and 80+ years), the mean area under the receiver operating characteristic curve (AUROC) to evaluate performance, and SHapley Additive exPlanations (SHAP) plots to interpret the models.
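The abstract does not state how the mean AUROC was computed. One common reading for survival models is a time-dependent AUROC evaluated at several follow-up horizons and then averaged. The sketch below illustrates that idea under stated assumptions: the function names, the toy data, and the chosen horizons are hypothetical, and patients censored before a horizon are simply dropped rather than handled with inverse-probability-of-censoring weights, so this is not the authors' implementation.

```python
from statistics import mean


def auroc_at_horizon(times, events, risks, horizon):
    """Unweighted cumulative/dynamic AUROC at a single time horizon.

    Cases: patients whose event was observed at or before the horizon.
    Controls: patients still under follow-up after the horizon.
    Patients censored before the horizon are excluded (a simplification;
    weighted estimators account for them instead of dropping them).
    """
    cases = [r for t, e, r in zip(times, events, risks) if e and t <= horizon]
    controls = [r for t, e, r in zip(times, events, risks) if t > horizon]
    if not cases or not controls:
        raise ValueError("need at least one case and one control at this horizon")
    # Fraction of case/control pairs in which the case has the higher
    # predicted risk; ties count as one half.
    concordant = sum(
        1.0 if rc > rn else 0.5 if rc == rn else 0.0
        for rc in cases
        for rn in controls
    )
    return concordant / (len(cases) * len(controls))


def mean_auroc(times, events, risks, horizons):
    """Average the horizon-specific AUROC values."""
    return mean(auroc_at_horizon(times, events, risks, h) for h in horizons)


# Hypothetical toy data: follow-up time in years, event indicator
# (True = death observed, False = censored), and predicted risk score.
times = [1.0, 2.5, 4.0, 6.0, 8.0, 10.0]
events = [True, True, False, True, False, False]
risks = [0.9, 0.7, 0.4, 0.3, 0.5, 0.1]

print(mean_auroc(times, events, risks, horizons=[3.0, 7.0]))
```

In this toy example the score separates early deaths perfectly at 3 years but misranks one pair at 7 years, which shows why a single averaged value can hide differences across follow-up time.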
RESULTS: Among 10,104 patients (66-79 years: n=7,190; 80+ years: n=2,914), all six models (three mortality outcomes in each of the two age groups) achieved a mean AUROC greater than 0.7, indicating acceptable performance. In both age groups, key factors for all-cause mortality included age, screenings for suspected conditions, and congestive heart failure. Key factors for BC-caused mortality were tumor size, cancer stage, progesterone receptor status, and the presence of secondary malignancies. Key factors for CVD-caused mortality included congestive heart failure, heart valve disorders, and other ill-defined heart diseases.
CONCLUSIONS: We developed survival prediction models with acceptable performance for older women with HR+ early BC. The most impactful prognostic factors for mortality, irrespective of the cause of death, were similar between the two age groups. Cancer-related factors and cardiovascular comorbidities were the key contributors to BC-caused and CVD-caused deaths, respectively. Focusing on these prognostic factors could potentially help reduce mortality.
Conference/Value in Health Info
Value in Health, Volume 27, Issue 6, S1 (June 2024)
Code
MSR49
Topic
Methodological & Statistical Research
Topic Subcategory
Artificial Intelligence, Machine Learning, Predictive Analytics
Disease
Oncology