Causal Inference and Machine Learning for Healthcare Applications

Abstract
This work applies causal inference, machine learning, and statistical methods to address public health and healthcare system challenges using both observational and randomized data.

One project develops a data-driven framework to evaluate physician performance in intensive care units (ICUs). Using tree ensemble models (XGBoost, Random Forest, and Tree Boosting Mixed Models) and explainable AI tools such as TreeSHAP, we quantify the contributions of physician evaluations and ICU departments to patient outcomes. Propensity weighting methods are used to balance patient characteristics and enable fair comparisons across physicians, with super learner–based approaches improving efficiency under model misspecification.

Another project estimates the causal effects of myocardial infarction and ischemic stroke on self-rated health using propensity weighting and matching. The results show that these events significantly reduce the probability of reporting very good or excellent health.
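As a rough illustration of propensity weighting for a binary outcome such as "reports very good or excellent health", the sketch below runs on simulated data. Everything in it is an assumption made for illustration: the single binary confounder `older`, the simulated probabilities, the function names, and the use of stratum frequencies as the propensity score. It is not the project's data, model, or estimator.

```python
import random

def simulate(n=20000, seed=0):
    """Simulate (confounder, event, outcome) rows; all numbers are illustrative."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        older = 1 if rng.random() < 0.4 else 0           # hypothetical binary confounder
        p_event = 0.30 if older else 0.10                # exposure depends on the confounder
        event = 1 if rng.random() < p_event else 0
        p_good = 0.60 - 0.20 * older - 0.15 * event      # simulated true effect: -0.15
        good = 1 if rng.random() < p_good else 0
        rows.append((older, event, good))
    return rows

def propensity_by_stratum(rows):
    """Estimate P(event = 1 | confounder) as the observed frequency in each stratum."""
    counts = {}
    for x, t, _ in rows:
        n, k = counts.get(x, (0, 0))
        counts[x] = (n + 1, k + t)
    return {x: k / n for x, (n, k) in counts.items()}

def naive_difference(rows):
    """Unadjusted difference in outcome proportions (exposed minus unexposed)."""
    y1 = [y for _, t, y in rows if t == 1]
    y0 = [y for _, t, y in rows if t == 0]
    return sum(y1) / len(y1) - sum(y0) / len(y0)

def ipw_risk_difference(rows):
    """Inverse-probability-weighted risk difference with normalized weights."""
    ps = propensity_by_stratum(rows)
    num1 = den1 = num0 = den0 = 0.0
    for x, t, y in rows:
        if t == 1:
            w = 1.0 / ps[x]            # weight exposed rows by 1 / e(x)
            num1 += w * y
            den1 += w
        else:
            w = 1.0 / (1.0 - ps[x])    # weight unexposed rows by 1 / (1 - e(x))
            num0 += w * y
            den0 += w
    return num1 / den1 - num0 / den0

if __name__ == "__main__":
    data = simulate()
    print("naive difference:", round(naive_difference(data), 3))
    print("IPW difference:  ", round(ipw_risk_difference(data), 3))
```

In this simulation the unadjusted comparison overstates the harm because exposed individuals are more often in the older stratum; weighting by the inverse propensity removes that imbalance. With many covariates the stratum frequencies would be replaced by a fitted propensity model.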

A third study analyzes a randomized controlled trial to assess the effectiveness of Diclectin for nausea and vomiting during pregnancy, highlighting how different missing data methods can substantially affect treatment effect inference.
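To make the sensitivity to missing data handling concrete, the sketch below compares two simple strategies on a tiny made-up data set: complete-case analysis and baseline observation carried forward (dropouts assigned a change of zero). The data, the function names, and the choice of these two strategies are assumptions for illustration; the abstract does not state which methods the study compared.

```python
def mean(values):
    return sum(values) / len(values)

def complete_case_effect(rows):
    """Difference in mean change (treated minus control), dropping missing outcomes."""
    treated = [y for arm, y in rows if arm == 1 and y is not None]
    control = [y for arm, y in rows if arm == 0 and y is not None]
    return mean(treated) - mean(control)

def bocf_effect(rows):
    """Same contrast, but a missing change score is imputed as 0 (no change from baseline)."""
    treated = [0.0 if y is None else y for arm, y in rows if arm == 1]
    control = [0.0 if y is None else y for arm, y in rows if arm == 0]
    return mean(treated) - mean(control)

# Hypothetical rows: (arm, change in symptom score); None marks a dropout.
# Negative change means improvement.
rows = [
    (1, -4.0), (1, -2.0), (1, None), (1, None),
    (0, -1.0), (0, -1.0), (0, -3.0), (0, None),
]

if __name__ == "__main__":
    print("complete-case effect:", round(complete_case_effect(rows), 3))
    print("BOCF effect:         ", round(bocf_effect(rows), 3))
```

Here the treated arm has more dropouts, so treating dropouts as unchanged shrinks the apparent benefit far more than discarding them does. Neither strategy is correct in general; which one is defensible depends on why the outcomes are missing.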

Yuan Bian
Postdoctoral Research Scientist