Online bootstrap confidence intervals for the stochastic gradient descent estimator

Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review

30 Scopus Citations

Author(s)

Detail(s)

Original language: English
Journal / Publication: Journal of Machine Learning Research
Volume: 19
Publication status: Published - 1 Dec 2018
Externally published: Yes

Abstract

In many applications involving large datasets or online learning, stochastic gradient descent (SGD) is a scalable algorithm for computing parameter estimates, and it has gained popularity due to its numerical convenience and memory efficiency. While the asymptotic properties of SGD-based estimators are well established, statistical inference such as interval estimation remains largely unexplored. The classical bootstrap is not directly applicable if the data are not stored in memory. The plug-in method is not applicable when there is no explicit formula for the covariance matrix of the estimator. In this paper, we propose an online bootstrap procedure for the estimation of confidence intervals, which, upon the arrival of each observation, updates the SGD estimate as well as a number of randomly perturbed SGD estimates. The proposed method is easy to implement in practice. We establish its theoretical properties for a general class of models that includes linear regressions, generalized linear models, M-estimators and quantile regressions as special cases. The finite-sample performance and numerical utility of the method are evaluated by simulation studies and real data applications.
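The procedure described in the abstract (update one SGD estimate and several randomly perturbed SGD estimates as each observation arrives) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from this record: the model is linear regression with squared loss, the perturbation multiplies each gradient by an Exponential(1) weight, the iterates are averaged, and the interval is a basic bootstrap interval. The names `online_bootstrap_sgd`, `simulate`, `lr0`, `decay` and `n_boot` are hypothetical.

```python
import numpy as np


def online_bootstrap_sgd(stream, dim, n_boot=200, lr0=0.2, decay=0.51,
                         level=0.95, seed=0):
    """Illustrative sketch: averaged SGD for least squares plus perturbed copies."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(dim)                # current SGD iterate
    theta_bar = np.zeros(dim)            # running average of iterates (point estimate)
    boot = np.zeros((n_boot, dim))       # perturbed SGD iterates, one row per copy
    boot_bar = np.zeros((n_boot, dim))   # running averages of the perturbed iterates

    for n, (x, y) in enumerate(stream, start=1):
        step = lr0 * n ** (-decay)       # assumed polynomially decaying step size

        # Plain SGD step on the squared loss 0.5 * (y - x @ theta) ** 2.
        theta = theta + step * (y - x @ theta) * x
        theta_bar += (theta - theta_bar) / n

        # Perturbed steps: each copy scales its gradient by a random weight
        # with mean 1 and variance 1 (assumed here: Exponential(1)).
        w = rng.exponential(1.0, size=n_boot)
        resid = y - boot @ x             # residual of each copy, shape (n_boot,)
        boot = boot + step * (w * resid)[:, None] * x[None, :]
        boot_bar += (boot - boot_bar) / n

    # Basic bootstrap interval built from the deviations of the perturbed
    # estimates around the unperturbed averaged estimate.
    alpha = 1.0 - level
    dev = boot_bar - theta_bar
    lo = theta_bar - np.quantile(dev, 1 - alpha / 2, axis=0)
    hi = theta_bar - np.quantile(dev, alpha / 2, axis=0)
    return theta_bar, lo, hi


def simulate(n, beta, seed=1):
    """Yield (x, y) pairs one at a time from a synthetic linear model."""
    rng = np.random.default_rng(seed)
    for _ in range(n):
        x = rng.normal(size=beta.size)
        yield x, x @ beta + rng.normal()


beta = np.array([1.0, -2.0, 0.5])
est, lo, hi = online_bootstrap_sgd(simulate(5000, beta), dim=beta.size)
```

Each observation is used once and then discarded, so memory grows with the number of perturbed copies rather than with the sample size, which is the situation the abstract describes where the classical bootstrap cannot be applied.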

Research Area(s)

  • Bootstrap, Generalized linear models, Interval estimation, Large datasets, M-estimators, Quantile regression, Resampling methods, Stochastic gradient descent

