A unified framework of analyzing missing data and variable selection using regularized likelihood


Date
Mar 12, 2024
Event
Xu Lab Seminar
Location
University Health Network & University of Toronto (Hybrid)

Abstract
Missing data arise commonly in applications, and research on this topic has received extensive attention over the past few decades. Various inference methods have been developed under different missing data mechanisms, including missing at random and missing not at random. Assessing whether an assumed missing data mechanism is feasible is, however, difficult because of the missingness itself. Furthermore, analyzing such data is complicated by the presence of inactive covariates. To handle these issues, we propose a unified modeling scheme that uses the parametric generalized additive model to characterize missing data processes of general forms or mechanisms. Using the generalized linear model to describe the dependence of the response on the associated covariates, we develop estimation procedures based on maximum likelihood and variable selection procedures based on regularized likelihood, and we rigorously establish the asymptotic properties of the resulting estimators. The proposed methods are appealing in their flexibility and generality, and they circumvent the need to assume a particular missing data mechanism, which most available methods require. Empirical studies demonstrate that the proposed methods perform satisfactorily in finite sample settings. An extension that accommodates missingness in covariates is also discussed.
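
To illustrate why the choice of missing data mechanism matters, the following is a minimal simulation sketch. It is not the method proposed in the talk. The sample size, the 0.5 observation probability, and the logistic missingness model with slope 2 are all assumptions chosen for illustration. It compares the complete-case mean of a response when values are missing independently of the response and when the chance of being observed depends on the unobserved value itself (missing not at random).

```python
import math
import random


def simulate(n=20000, seed=0):
    """Return (full mean, complete-case mean under independent missingness,
    complete-case mean under missing-not-at-random)."""
    rng = random.Random(seed)
    # Hypothetical response: standard normal, true mean 0.
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]

    # Missingness independent of y: each value is observed with probability 0.5.
    observed_indep = [v for v in y if rng.random() < 0.5]

    # Missing not at random: the probability of being observed is
    # logistic in the value itself (slope 2 is an illustrative assumption).
    observed_mnar = [
        v for v in y if rng.random() < 1.0 / (1.0 + math.exp(-2.0 * v))
    ]

    def mean(values):
        return sum(values) / len(values)

    return mean(y), mean(observed_indep), mean(observed_mnar)


full_mean, indep_mean, mnar_mean = simulate()
print(f"full data mean:              {full_mean:.3f}")
print(f"complete cases, independent: {indep_mean:.3f}")
print(f"complete cases, MNAR:        {mnar_mean:.3f}")
```

In this sketch the complete-case mean stays close to the full-data mean when missingness is independent of the response, and it is shifted upward when larger values are more likely to be observed. An analysis that assumes the wrong mechanism can therefore be biased.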

Yuan Bian
Incoming Postdoc in Biostatistics