Distributed Mallows Model Averaging for Ridge Regressions

Haili Zhang, Alan T. K. Wan, Kang You, Guohua Zou*

*Corresponding author for this work

Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review

1 Citation (Scopus)

Abstract

Ridge regression is an effective tool for handling multicollinearity in regression. It is also an essential shrinkage and regularization method and is widely used in big data and distributed data applications. The divide-and-conquer approach, which combines the estimators from all subsets with equal weights, is commonly applied to distributed data. To overcome multicollinearity and improve estimation accuracy in the presence of distributed data, we propose a Mallows-type model averaging method for ridge regressions, which combines estimators from all subsets. Our method is proved to be asymptotically optimal while allowing the number of subsets and the dimension of the variables to diverge. The consistency of the resulting weight estimators, in the sense that they tend to the theoretically optimal weights, is also derived. Furthermore, the asymptotic normality of the model averaging estimator is demonstrated. Our simulation study and real data analysis show that the proposed model averaging method often performs better than commonly used model selection and model averaging methods in distributed data settings.

© Springer-Verlag GmbH Germany & The Editorial Office of AMS 2025.
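
The equal-weight divide-and-conquer baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `ridge_fit` and `combine_subset_ridge`, the penalty `lam`, and the simulated data are all assumptions made for the example. The abstract does not state the Mallows-type criterion used to choose the weights, so it is not reproduced here; the optional `weights` argument is only a hypothetical placeholder showing where data-driven weights would enter.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge estimate: solve (X'X + lam * I) beta = X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


def combine_subset_ridge(X, y, n_subsets, lam, weights=None):
    """Fit ridge on each row-wise subset and return a weighted combination.

    weights=None gives the equal-weight divide-and-conquer baseline.
    Any other weights are supplied by the caller (hypothetical hook;
    the paper's weight-selection criterion is not implemented here).
    """
    X_parts = np.array_split(X, n_subsets)
    y_parts = np.array_split(y, n_subsets)
    # One coefficient vector per subset, stacked as rows.
    betas = np.stack(
        [ridge_fit(Xk, yk, lam) for Xk, yk in zip(X_parts, y_parts)]
    )
    if weights is None:
        weights = np.full(n_subsets, 1.0 / n_subsets)
    weights = np.asarray(weights, dtype=float)
    return weights @ betas


# Simulated data (assumed for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 5))
beta_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ beta_true + 0.1 * rng.normal(size=600)

beta_dc = combine_subset_ridge(X, y, n_subsets=4, lam=1.0)
```

Passing a weight vector that puts all mass on one subset recovers that subset's own ridge estimate, which is a quick sanity check on the combination step.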
Original language: English
Pages (from-to): 780-826
Journal: Acta Mathematica Sinica, English Series
Volume: 41
Issue number: 2
Online published: 15 Feb 2025
Publication status: Published - Feb 2025

Research Keywords

  • 62F12
  • 62H10
  • 62J07
  • asymptotic normality
  • asymptotic optimality
  • consistency
  • distributed data
  • Mallows model averaging
  • ridge regression
