Optimal Decorrelated Score Subsampling for High-Dimensional Generalized Linear Models Under Measurement Constraints

Research output: Journal Publications and Reviews > RGC 21 - Publication in refereed journal > peer-review

Detail(s)

Original language: English
Journal / Publication: Journal of Computational and Graphical Statistics
Online published: 5 Nov 2024
Publication status: Online published - 5 Nov 2024

Abstract

When the responses in massive data sets are hard to obtain, for reasons such as privacy and security, high cost, and administrative management, response-free subsampling can be considered. In this article, we propose a response-free decorrelated score subsampling approach for estimation and statistical inference on a prespecified low-dimensional parameter in high-dimensional generalized linear models. The unconditional consistency and asymptotic normality of the resulting weighted subsample estimator are established using martingale techniques, since the subsamples are no longer independent. The optimal response-free subsampling probabilities are derived based on A- and L-optimality criteria. Based on the optimal subsample, we further propose a more efficient and stable unweighted decorrelated score subsample estimator. The satisfactory performance of our proposed subsample estimators is demonstrated by simulation results and two real data applications. Supplementary materials for this article are available online. © 2024 American Statistical Association and Institute of Mathematical Statistics.
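
To illustrate the general idea of response-free subsampling mentioned in the abstract, the following is a minimal sketch under stated assumptions. It is not the article's method: it does not implement the decorrelated score or the A- and L-optimal probabilities, and it uses a low-dimensional logistic model. The function names (`response_free_probs`, `weighted_logistic_fit`, `subsample_estimate`) and the norm-proportional sampling rule are hypothetical choices made only for illustration. What it shows is that sampling probabilities are computed from covariates alone, responses are measured only for the selected rows, and the fit is weighted by inverse sampling probabilities.

```python
import numpy as np

def response_free_probs(X):
    # Hypothetical rule: probabilities proportional to the covariate norm.
    # They depend on X only, never on the responses.
    norms = np.linalg.norm(X, axis=1)
    return norms / norms.sum()

def weighted_logistic_fit(X, y, w, n_iter=50, tol=1e-10):
    # Newton-Raphson for a logistic regression with observation weights w.
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (w * (y - mu))
        hess = (X * (w * mu * (1.0 - mu))[:, None]).T @ X
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def subsample_estimate(X, measure_response, r, rng):
    # Draw r rows with replacement, then measure responses for those rows only.
    n = X.shape[0]
    p = response_free_probs(X)
    idx = rng.choice(n, size=r, replace=True, p=p)
    y_sub = measure_response(idx)
    w = 1.0 / (n * p[idx])  # inverse-probability weights
    return weighted_logistic_fit(X[idx], y_sub, w), idx

# Simulated example: the full response vector exists only to mimic
# an expensive measurement step that is invoked on the subsample.
rng = np.random.default_rng(0)
n, r = 20000, 2000
beta_true = np.array([0.5, -1.0, 1.0])
X = rng.standard_normal((n, 3))
y_full = rng.binomial(1, 1.0 / (1.0 + np.exp(-(X @ beta_true))))

beta_hat, idx = subsample_estimate(X, lambda i: y_full[i], r, rng)
print("estimate from subsample:", beta_hat)
```

In this sketch only `r` of the `n` responses are ever accessed. The inverse-probability weights correct for the unequal sampling probabilities, which is the role the weighted subsample estimator plays in the abstract.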

Research Area(s)

  • High-dimensional inference, Martingale techniques, Response-free subsampling, Unconditional asymptotics