Incorporating Social Determinants of Health (SDOH) Into a Screening Tool Using Machine Learning to Predict Lung Cancer Diagnosis

Speaker(s)

Jackson B¹, Madlock-Brown C²
¹OPEN Health, Hingham, MA, USA, ²University of Iowa, Iowa City, IA, USA

OBJECTIVES: Current guidelines for lung cancer screening fail to account for social risk factors. This study aimed to develop a predictive screening tool, incorporating social determinants of health (SDOH) data, to improve model accuracy in the earlier identification of patients at risk of developing lung cancer.

METHODS: The study uses de-identified EHR data from the Research Enterprise Data Warehouse (rEDW), which covers patients from three institutions in the state of Tennessee. Baseline data from 2018 and 2019 are used to predict lung cancer diagnosis in 2020. Eligible participants are “high risk”, defined as at least 50 years of age with a smoking history and no prior history of lung cancer. Extracted patient-level characteristics include demographic information and clinical parameters, such as comorbidities, symptoms, procedures, and laboratory test results. SDOH features within the five SDOH domains (economic stability, neighborhood and built environment, health and health care, education access and quality, and social and community context) are captured from the American Community Survey (ACS) (2018-2019). Patient home addresses are geocoded and linked with the ACS data to obtain SDOH data at the census-tract level.
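The eligibility rule and the tract-level linkage described above can be sketched as follows. This is a minimal illustration in Python, not the study's pipeline: the record fields (`age`, `smoking_history`, `prior_lung_cancer`, `tract_geoid`), the two SDOH variable names, and all values are hypothetical, since the abstract does not specify the rEDW schema or the ACS variables used.

```python
# Hypothetical SDOH feature names; the abstract does not list the ACS variables used.
SDOH_FEATURES = ("median_household_income", "pct_no_health_insurance")


def is_high_risk(patient):
    """Eligibility as stated in the abstract: age >= 50, smoking history,
    and no prior history of lung cancer."""
    return bool(
        patient["age"] >= 50
        and patient["smoking_history"]
        and not patient["prior_lung_cancer"]
    )


def link_sdoh(patient, acs_by_tract):
    """Return a copy of the patient record with census-tract SDOH features attached.
    Features are set to None when the tract is missing from the ACS table."""
    tract = acs_by_tract.get(patient["tract_geoid"], {})
    linked = dict(patient)
    for name in SDOH_FEATURES:
        linked[name] = tract.get(name)
    return linked


# Made-up records and tract identifiers, for illustration only.
patients = [
    {"id": "A", "age": 62, "smoking_history": True, "prior_lung_cancer": False,
     "tract_geoid": "47000000001"},
    {"id": "B", "age": 45, "smoking_history": True, "prior_lung_cancer": False,
     "tract_geoid": "47000000001"},  # excluded: under 50
    {"id": "C", "age": 70, "smoking_history": True, "prior_lung_cancer": True,
     "tract_geoid": "47000000002"},  # excluded: prior lung cancer
    {"id": "D", "age": 55, "smoking_history": True, "prior_lung_cancer": False,
     "tract_geoid": "47000000003"},  # eligible, but tract not in the ACS table
]

# Made-up tract-level values keyed by census-tract identifier.
acs_by_tract = {
    "47000000001": {"median_household_income": 41000, "pct_no_health_insurance": 14.2},
    "47000000002": {"median_household_income": 58000, "pct_no_health_insurance": 8.9},
}

cohort = [link_sdoh(p, acs_by_tract) for p in patients if is_high_risk(p)]
```

Keeping unmatched tracts as `None` rather than dropping the patient leaves the handling of missing SDOH values as an explicit downstream decision.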

RESULTS: The study population includes 46,470 patients, of whom approximately 1,400 had a lung cancer diagnosis. We fit decision tree, naïve Bayes, support vector machine, and artificial neural network models and present, by domain, the SDOH factors associated with increased lung cancer risk. Predictive performance (AUC and F1-measure) is evaluated on a validation holdout set before and after adding SDOH features to the traditionally examined demographic and clinical characteristics.
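The before-and-after comparison on a holdout set can be illustrated with a small sketch of the two metrics named above. The labels and scores below are made up to show the mechanics and say nothing about the study's findings; the 0.5 decision threshold for F1 is also an assumption, as the abstract does not state one.

```python
def auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1(y_true, scores, threshold=0.5):
    """F1-measure after thresholding scores into binary predictions."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical holdout labels (1 = lung cancer diagnosis) and model scores.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores_baseline = [0.6, 0.7, 0.4, 0.2, 0.3, 0.8, 0.1, 0.5]    # demographic + clinical only
scores_with_sdoh = [0.7, 0.6, 0.55, 0.2, 0.3, 0.9, 0.1, 0.4]  # plus SDOH features

for label, scores in [("baseline", scores_baseline), ("with SDOH", scores_with_sdoh)]:
    print(f"{label}: AUC={auc(y_true, scores):.3f}, F1={f1(y_true, scores):.3f}")
```

With roughly 1,400 diagnoses among 46,470 patients the outcome is rare, which is one reason to report F1 alongside AUC: F1 depends on the chosen threshold and on how well the positive class is recovered.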

CONCLUSIONS: This study highlights how differences in patient SDOH contribute to disparities in lung cancer risk. Incorporating SDOH into screening tools is a step toward the goal of the NCI’s National Cancer Plan to eliminate inequities.

Code

PCR265

Topic

Methodological & Statistical Research, Study Approaches

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics, Electronic Medical & Health Records

Disease

No Additional Disease & Conditions/Specialized Treatment Areas, Oncology