Abstract
The current focus on real world evidence (RWE) is occurring at a time when at least two major trends are converging. The first is the progress made in observational research design and methods over the past decade. The second is the development of numerous large observational healthcare databases around the world, which is creating repositories of improved data assets to support observational research.
Objective
This paper examines how improvements in observational methods and research design, together with the growing availability of real world data, affect the quality of RWE. These developments have been very positive. On the other hand, unstructured data, such as medical notes, and the sparsity of data created by merging multiple data assets are not easily handled by the statistical methods traditionally used in health services research. In response, machine learning methods are gaining traction as potential tools for analyzing massive, complex datasets.
Conclusions
Machine learning methods have traditionally been used for classification and prediction, rather than causal inference. The prediction capabilities of machine learning are valuable in their own right. However, the use of machine learning for causal inference is still evolving. Machine learning can be used for hypothesis generation, followed by the application of traditional causal methods. Relatively recent developments, such as targeted maximum likelihood methods, go further and integrate machine learning directly with causal inference.
Authors
William H. Crown