Abstract
Extracting information from data is critical in the age of data science. A probability density function in theory provides comprehensive information about the data. In practice, however, different probability density models, whether parametric or nonparametric, often characterize only partial features of the data, e.g., owing to model bias or low estimation efficiency. In this paper we suggest a framework that optimally combines different density models to capture the comprehensive features of the data through a new information criterion (IC) based unsupervised learning approach. Our information extraction is optimal in the sense that the resulting averaged or selected density minimizes the Kullback-Leibler (KL) information loss function. Unlike the usual supervised learning IC for model selection or averaging, our setting first requires deriving an estimator of the KL loss function, which takes the Akaike and Takeuchi information criteria as two special cases. A feasible density model averaging (DMA) procedure is accordingly suggested, with the DMA estimation achieving the lowest possible KL loss asymptotically. Further, we establish the consistency of the DMA weights, which tend to the optimal averaging weights minimizing the KL distance, and derive the convergence rate of the empirical weights. Simulation studies show that the DMA performs better overall and more robustly than commonly used parametric and nonparametric density models, including kernel, finite mixture, logarithmic scoring rule and selection methods for density estimation in the literature. A real data analysis further demonstrates the performance of the proposed method.
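For orientation, the objective described in the abstract can be written out explicitly. The notation is an assumption made here for illustration and is not taken from the paper: $\hat f_1, \dots, \hat f_M$ stand for the fitted candidate densities, $w = (w_1, \dots, w_M)$ for a weight vector assumed to lie on the unit simplex, and $f$ for the unknown true density.

```latex
% Averaged density under the assumed simplex constraint on the weights
\[
  \hat f_w(x) = \sum_{m=1}^{M} w_m \, \hat f_m(x),
  \qquad w_m \ge 0, \quad \sum_{m=1}^{M} w_m = 1 .
\]
% Kullback-Leibler loss of the averaged density relative to the true density f
\[
  \mathrm{KL}(w) = \int f(x) \, \log \frac{f(x)}{\hat f_w(x)} \, dx .
\]
```

In this notation, selecting a single density corresponds to a weight vector with one entry equal to 1 and the rest equal to 0. Because $f$ is unknown, $\mathrm{KL}(w)$ cannot be evaluated directly, which is why an estimator of the KL loss has to be derived before the weights can be chosen.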
| Original language | English |
|---|---|
| Number of pages | 40 |
| Journal | Statistica Sinica |
| Volume | 36 |
| Issue number | 3 |
| Online published | 2025 |
| Publication status | Online published - 2025 |
Bibliographical note
Research Unit(s) information for this publication is provided by the author(s) concerned.
Research Keywords
- Asymptotic optimality
- Density estimation
- Density averaging
- Weight choice