Additive partially linear models for massive heterogeneous data

  • Binhuan Wang
  • Yixin Fang*
  • Heng Lian
  • Hua Liang

*Corresponding author for this work

Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review


Abstract

We consider an additive partially linear framework for modelling massive heterogeneous data. The major goal is to extract multiple common features simultaneously across all sub-populations while exploring heterogeneity of each sub-population. We propose an aggregation type of estimators for the commonality parameters that possess the asymptotic optimal bounds and the asymptotic distributions as if there were no heterogeneity. This oracle result holds when the number of sub-populations does not grow too fast and the tuning parameters are selected carefully. A plug-in estimator for the heterogeneity parameter is further constructed, and shown to possess the asymptotic distribution as if the commonality information were available. Furthermore, we develop a heterogeneity test for the linear components and a homogeneity test for the non-linear components accordingly. The performance of the proposed methods is evaluated via simulation studies and an application to the Medicare Provider Utilization and Payment data.
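
The aggregation and plug-in steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: the model form (a sub-population-specific linear part plus one shared smooth function g), the cubic truncated-power spline basis, the fixed knots, the synthetic data, and the hypothetical helper names `spline_basis`, `fit_one`, and `aggregate_and_plug_in`. Each sub-population is fitted separately by least squares, the spline coefficients of the shared component are averaged across sub-populations, and each linear part is then re-estimated with the averaged component plugged in.

```python
import numpy as np


def spline_basis(z, knots):
    """Cubic truncated-power basis without an intercept column (assumed basis)."""
    pieces = [z, z**2, z**3] + [np.clip(z - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack(pieces)


def fit_one(x, z, y, knots):
    """Least squares of y on [1, x, B(z)] within a single sub-population."""
    design = np.column_stack([np.ones(len(y)), x, spline_basis(z, knots)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    n_linear = 1 + x.shape[1]
    return coef[:n_linear], coef[n_linear:]


def aggregate_and_plug_in(groups, knots):
    """Average spline coefficients across groups, then refit each linear part."""
    gamma_bar = np.mean([fit_one(x, z, y, knots)[1] for x, z, y in groups], axis=0)
    betas = []
    for x, z, y in groups:
        # Remove the aggregated smooth component, then regress on [1, x].
        partial_resid = y - spline_basis(z, knots) @ gamma_bar
        linear_design = np.column_stack([np.ones(len(y)), x])
        beta, *_ = np.linalg.lstsq(linear_design, partial_resid, rcond=None)
        betas.append(beta)
    return gamma_bar, np.array(betas)


# Synthetic data (assumed): 5 sub-populations share g(z) = sin(2*pi*z)
# but have different slopes on x.
rng = np.random.default_rng(0)
knots = np.linspace(0.0, 1.0, 7)[1:-1]
true_slopes = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
groups = []
for slope in true_slopes:
    x = rng.normal(size=(400, 1))
    z = rng.uniform(size=400)
    y = slope * x[:, 0] + np.sin(2 * np.pi * z) + 0.1 * rng.normal(size=400)
    groups.append((x, z, y))

gamma_bar, betas = aggregate_and_plug_in(groups, knots)

# Evaluate the aggregated smooth component on a grid for inspection.
grid = np.linspace(0.0, 1.0, 101)
g_hat = spline_basis(grid, knots) @ gamma_bar
```

In this sketch `betas[:, 1]` holds the per-sub-population slope estimates and `g_hat` the aggregated estimate of the shared component. The knots are fixed by hand here, so the sketch says nothing about the tuning-parameter selection or the hypothesis tests discussed in the paper.
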
Original language: English
Pages (from-to): 391-431
Journal: Electronic Journal of Statistics
Volume: 13
Issue number: 1
Online published: 9 Feb 2019
Publication status: Published - 2019

Research Keywords

  • Divide-and-conquer
  • Heterogeneity
  • Homogeneity
  • Oracle property
  • Regression splines

Publisher's Copyright Statement

  • This full text is made available under CC-BY 4.0. https://creativecommons.org/licenses/by/4.0/
