USING ARTIFICIAL INTELLIGENCE TO IMPROVE CAPTURE OF METASTATIC BREAST CANCER (BC) STATUS IN ELECTRONIC HEALTH RECORDS (EHR)

Author(s)

Agrawal S, Colano V, Chandrashekaraiah P, Vaidya VP, Inbar O, Jun MP, Stepanski EJ, Walker MS, Peevyhouse A, Narayanan B, Hyde B
Concerto Health AI, Boston, MA, USA

OBJECTIVES : Though an important prognostic feature in cancer, stage information is often missing from patients' EHRs and unavailable in claims data. The primary objective of this study is to develop and validate an artificial intelligence model that classifies metastatic status in BC patients at their last observed timepoint (a proxy for present day) using previously collected, de-identified, retrospective EHR data.

METHODS : Approximately 32,000 BC patients were selected from the ASCO CancerLinQ (CLQ) dataset, of whom 20% were metastatic. Features including diagnostic codes, laboratory values, medications, and previous stage information were extracted from structured fields. Data were harmonized using medical taxonomies, and singular value decomposition was performed to reduce dimensionality. After cross-validation (CV), a gradient-boosted tree model with a depth of 8 and 10,000 estimators was selected as the best-performing model. Of the 32,000 patients, 50% were used for training, 15% for CV, 15% for testing, and 20% for internal validation, for which metastatic status was curated from clinical notes by expert nurse curators.
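
The 50/15/15/20 partition described above can be sketched as follows. This is a minimal illustration only: the abstract does not say whether patients were assigned randomly or by stratification, so the seeded random shuffle, the function name `split_patients`, and the integer patient IDs are all assumptions.

```python
import random

def split_patients(patient_ids, percents=(50, 15, 15, 20), seed=0):
    """Shuffle patient IDs and partition them into four disjoint sets.

    Assumption: simple random assignment. The abstract does not state
    how patients were allocated to each set.
    """
    names = ("train", "cv", "test", "validation")
    if sum(percents) != 100 or len(percents) != len(names):
        raise ValueError("percents must have four entries summing to 100")

    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # seeded for reproducibility

    n = len(ids)
    splits, start, cumulative = {}, 0, 0
    for name, pct in zip(names, percents):
        cumulative += pct
        end = n * cumulative // 100  # integer arithmetic avoids float drift
        splits[name] = ids[start:end]
        start = end
    return splits

# Hypothetical cohort of 32,000 patients identified by integer IDs.
splits = split_patients(range(32000))
sizes = {name: len(ids) for name, ids in splits.items()}
```

With a cohort of exactly 32,000, this yields 16,000 training, 4,800 CV, 4,800 test, and 6,400 validation patients; the abstract's cohort size is approximate, so the real counts may differ.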

RESULTS : On the full curated validation set, the model had an AUC-ROC of 0.88 and a globally weighted F1 score of 0.87. The model predicted metastasis in BC with a positive predictive value (PPV) of 0.93 and a negative predictive value (NPV) of 0.86. For the subset of patients with no recorded stage or secondary malignancy information in the EHR, the model predicted metastasis with a PPV of 0.89. Compared with business rules alone, the model correctly identified 32% more metastatic cases.
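
For readers less familiar with these metrics, the sketch below shows how PPV (precision), NPV, and recall follow from confusion-matrix counts. The counts are invented for illustration and are not the study's data; the function name `predictive_values` is likewise hypothetical.

```python
def predictive_values(tp, fp, tn, fn):
    """Compute PPV, NPV, and recall from confusion-matrix counts.

    tp: metastatic patients predicted metastatic
    fp: non-metastatic patients predicted metastatic
    tn: non-metastatic patients predicted non-metastatic
    fn: metastatic patients predicted non-metastatic
    """
    return {
        "ppv": tp / (tp + fp),     # share of positive predictions that are correct
        "npv": tn / (tn + fn),     # share of negative predictions that are correct
        "recall": tp / (tp + fn),  # share of true metastatic cases detected
    }

# Illustrative counts only; NOT taken from the study.
metrics = predictive_values(tp=90, fp=10, tn=80, fn=20)
```

PPV and NPV depend on the prevalence of metastatic disease in the evaluated cohort, so values measured on one population do not transfer automatically to another with a different case mix.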

CONCLUSIONS : This model yielded high precision and recall, and thus could be an important tool for imputing missing stage information in EHRs. Compared with expert manual curation, it could save substantial time and resources by quickly identifying patients for clinical trial enrollment or retrospective outcomes studies.

Conference/Value in Health Info

2019-05, ISPOR 2019, New Orleans, LA, USA

Value in Health, Volume 22, Issue S1 (2019 May)

Code

PPM11

Topic

Medical Technologies, Methodological & Statistical Research

Topic Subcategory

Artificial Intelligence, Machine Learning, Predictive Analytics, Digital Health, Missing Data, Modeling and simulation

Disease

Oncology
