Data-Dependent Generalization Bounds for Multi-Class Classification
Research output: Journal Publications and Reviews (RGC: 21, 22, 62) › 21_Publication in refereed journal › peer-review
Author(s)
Related Research Unit(s)
Detail(s)
Original language | English |
---|---|
Article number | 8620322 |
Pages (from-to) | 2995-3021 |
Journal / Publication | IEEE Transactions on Information Theory |
Volume | 65 |
Issue number | 5 |
Online published | 21 Jan 2019 |
Publication status | Published - May 2019 |
Link(s)
Abstract
In this paper, we study data-dependent generalization error bounds that exhibit a mild dependency on the number of classes, making them suitable for multi-class learning with a large number of label classes. The bounds generally hold for empirical multi-class risk minimization algorithms using an arbitrary norm as the regularizer. Key to our analysis is new structural results for multi-class Gaussian complexities and empirical ℓ∞ -norm covering numbers, which exploit the Lipschitz continuity of the loss function with respect to the ℓ2 - and ℓ∞ -norm, respectively. We establish data-dependent error bounds in terms of the complexities of a linear function class defined on a finite set induced by training examples, for which we show tight lower and upper bounds. We apply the results to several prominent multi-class learning machines and show a tighter dependency on the number of classes than the state of the art. For instance, for the multi-class support vector machine of Crammer and Singer (2002), we obtain a data-dependent bound with a logarithmic dependency, which is a significant improvement of the previous square-root dependency. The experimental results are reported to verify the effectiveness of our theoretical findings.
Research Area(s)
- covering numbers, Gaussian complexities, generalization error bounds, Multi-class classification, Rademacher complexities
Citation Format(s)
Data-Dependent Generalization Bounds for Multi-Class Classification. / Lei, Yunwen; Dogan, Urun; Zhou, Ding-Xuan et al.
In: IEEE Transactions on Information Theory, Vol. 65, No. 5, 8620322, 05.2019, p. 2995-3021.Research output: Journal Publications and Reviews (RGC: 21, 22, 62) › 21_Publication in refereed journal › peer-review