TY - JOUR
T1 - Semiparametric recovery of central dimension reduction space with nonignorable nonresponse
AU - Zheng, Siming
AU - Wan, Alan T. K.
AU - Zhou, Yong
PY - 2024/5
Y1 - 2024/5
N2 - Sufficient dimension reduction (SDR) methods are effective tools for handling high-dimensional data. Classical SDR methods are developed under the assumption that the data are completely observed. When the data are incomplete due to missing values, SDR has been considered only when the data are randomly missing, but not when they are nonignorably missing, a setting that is arguably more difficult to handle because the missingness depends on the values that are missing. The purpose of this paper is to fill this void. We propose an intuitive, easy-to-implement SDR estimator based on a semiparametric propensity score function for response data with nonignorable missing values. We refer to it as the dimension reduction-based imputed estimator. We establish the theoretical properties of this estimator and examine its empirical performance via an extensive numerical study on real and simulated data. We also compare the performance of the proposed estimator with two competing estimators: the fusion refined estimator and the cumulative slicing estimator. A distinguishing feature of our method is that it requires no validation sample. The SDR theory developed in this paper is a non-trivial extension of the existing literature, due to the technical challenges posed by nonignorable missingness. All technical proofs of the theorems are given in Appendix S1. © 2023 Netherlands Society for Statistics and Operations Research.
KW - central subspace
KW - nonignorable nonresponse
KW - nonparametric imputation
KW - propensity score
KW - sufficient dimension reduction
UR - http://www.scopus.com/inward/record.url?scp=85171432233&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-85171432233&origin=recordpage
U2 - 10.1111/stan.12321
DO - 10.1111/stan.12321
M3 - RGC 21 - Publication in refereed journal
SN - 0039-0402
VL - 78
SP - 374
EP - 396
JO - Statistica Neerlandica
JF - Statistica Neerlandica
IS - 2
ER -