Abstract
Boosting has emerged as one of the most powerful machine learning techniques of the past three decades, attracting the attention of numerous researchers. Most advancements in this field, however, have concentrated on numerical implementation procedures tailored to datasets with complete observations, often without theoretical justification. In this paper, using semiparametric optimal estimation approaches, we develop unbiased boosting estimation methods for data missing not at random (MNAR) and explore two strategies for adjusting the loss functions to accommodate missingness effects. To address the issue of model misspecification, we further propose a multiply robust loss function that accounts for MNAR effects. We implement the proposed methods via a functional gradient descent algorithm and rigorously establish their theoretical properties, including consistency and convergence of the optimization. Numerical studies demonstrate that the proposed methods perform satisfactorily in finite sample settings.