Inference and Variable Selection with Missing Data

Last updated on Jul 11, 2025

How AI thinks about missing data. Created with DALL·E 3.

Missing data arise commonly in applications, and the topic has received extensive research attention over the past few decades. Various inference methods have been developed under different missing data mechanisms, including missing at random and missing not at random. Assessing which missing data mechanism is plausible is difficult, however, because validation data are lacking. The problem is further complicated by the presence of spurious variables among the covariates. Focusing on missingness in the response variable, a unified modeling scheme is proposed that uses the parametric generalized additive model to characterize various types of missing data processes. With a generalized linear model describing the dependence of the response on the associated covariates, concurrent estimation and variable selection procedures are developed using regularized likelihood, and the asymptotic properties of the resulting estimators are rigorously established. The proposed methods are appealing in their flexibility and generality; they circumvent the need to assume a particular missing data mechanism, which most available methods require. Empirical studies demonstrate that the proposed methods perform satisfactorily in finite-sample settings. Extensions that accommodate missingness in both the response and covariates are also discussed.
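To make the idea of a regularized likelihood for a partially observed response concrete, the following is a minimal sketch under simplifying assumptions; it is not the model or code from the paper. It assumes a binary response with a logistic regression model, a logistic missingness model in which a single coefficient `gamma` lets the probability of observing the response depend on the response itself (`gamma = 0` corresponds to missing at random, `gamma != 0` to missing not at random), and an L1 penalty on the slope coefficients. The function name `penalized_neg_loglik` and all parameter names are hypothetical.

```python
import numpy as np


def expit(z):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def penalized_neg_loglik(beta, alpha, gamma, X, y, r, lam):
    """L1-penalized negative observed-data log-likelihood (illustrative only).

    Assumed models (not taken from the paper):
      response:     P(Y = 1 | X)    = expit(X @ beta)
      missingness:  P(R = 1 | Y, X) = expit(X @ alpha + gamma * Y)
    X has an intercept in column 0; r[i] = 1 if y[i] is observed, else 0.
    Entries of y with r == 0 are ignored and may be NaN.
    """
    p_y1 = expit(X @ beta)                 # P(Y = 1 | X)
    p_obs_y1 = expit(X @ alpha + gamma)    # P(R = 1 | Y = 1, X)
    p_obs_y0 = expit(X @ alpha)            # P(R = 1 | Y = 0, X)

    # Observed responses contribute f(y | x) * P(R = 1 | y, x).
    y_safe = np.where(r == 1, y, 0.0)      # placeholder where y is missing
    lik_obs = np.where(y_safe == 1, p_y1 * p_obs_y1, (1 - p_y1) * p_obs_y0)

    # Missing responses contribute the sum over y of f(y | x) * P(R = 0 | y, x).
    lik_mis = p_y1 * (1 - p_obs_y1) + (1 - p_y1) * (1 - p_obs_y0)

    loglik = np.sum(np.where(r == 1, np.log(lik_obs), np.log(lik_mis)))

    # L1 penalty on slopes only; intercepts and gamma are left unpenalized here.
    penalty = len(r) * lam * (np.sum(np.abs(beta[1:])) + np.sum(np.abs(alpha[1:])))
    return -loglik + penalty


# Small simulated example: the last covariate is spurious for the response,
# and missingness depends on the response itself.
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta_true = np.array([-0.5, 1.0, 0.0])
y = rng.binomial(1, expit(X @ beta_true)).astype(float)
r = rng.binomial(1, expit(0.5 + 1.0 * y))
y[r == 0] = np.nan

value = penalized_neg_loglik(beta_true, np.array([0.5, 0.0, 0.0]), 1.0, X, y, r, lam=0.01)
print(value)
```

Minimizing such an objective jointly over the response and missingness parameters would shrink some coefficients to zero, which is how estimation and variable selection can happen at once. The L1 penalty is non-smooth, so a generic gradient-based optimizer is not ideal; the choice of penalty, tuning parameter, and optimization algorithm is left open in this sketch.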

Publications

Statistical Inference and Learning with Incomplete Data

PhD thesis, University of Western Ontario, 2024.

Yuan Bian

A unified framework of analyzing missing data and variable selection using regularized likelihood

Computational Statistics and Data Analysis, 194(6): Article 107919, 2024.

Yuan Bian, Grace Y. Yi, Wenqing He

Events

A unified framework of analyzing missing data and variable selection using regularized likelihood

Joint Meetings of Taipei International Statistical Symposium and International Chinese Statistical Association International Conference, Invited Talk, Presented by Wenqing He

Dec 18, 2025 Academia Sinica, China

A unified framework of analyzing missing data and variable selection using regularized likelihood

Joint Conference on Statistics and Data Science in China, Invited Talk, Presented by Wenqing He

Jul 12, 2025 Taixuhu Holiday Hotel, China

A unified framework of analyzing missing data and variable selection using regularized likelihood

International Conference on Statistics and Data Science, Invited Talk, Presented by Wenqing He

Jun 23, 2025 The Harbour Centre, Canada

Statistical inference and learning with incomplete data

PhD Public Lecture, Invited Talk

Aug 26, 2024 University of Western Ontario, Canada

A unified framework of analyzing missing data and variable selection using regularized likelihood

World Congress in Probability and Statistics, Invited Talk, Presented by Wenqing He

Aug 13, 2024 Ruhr University Bochum, Germany

A unified framework of analyzing missing data and variable selection using regularized likelihood

Xu Lab Seminar, Invited Talk

Mar 12, 2024 University Health Network & University of Toronto, Canada (Hybrid)

A unified framework of analyzing missing data and variable selection using regularized likelihood

Canadian Statistical Sciences Institute Annual Showcase, Contributed Talk

Nov 17, 2023 Virtual

A unified framework of analyzing missing data and variable selection using regularized likelihood

Canadian Statistical Sciences Institute - National Institute of Statistical Sciences Health Data Science Workshop, Poster Presentation

Aug 3, 2023 University of Waterloo, Canada

A unified framework of analyzing data with response missingness using regularized likelihood

Statistical Data Science Conference, Contributed Talk

Jun 4, 2023 University of British Columbia, Canada

A unified framework of analyzing data with response missingness using regularized likelihood

Graduate Colloquium, Invited Talk

Feb 16, 2023 University of Western Ontario, Canada

Analysis of missing data with regularized likelihood

Statistical Society of Canada Annual Meeting, Contributed Talk

Jun 11, 2021 Virtual due to COVID-19 pandemic
