Skip to main navigation Skip to search Skip to main content

Data-Dependent Generalization Bounds for Multi-Class Classification

Yunwen Lei*, Urun Dogan*, Ding-Xuan Zhou*, Marius Kloft*

*Corresponding author for this work

Research output: Journal Publications and ReviewsRGC 21 - Publication in refereed journalpeer-review

Abstract

In this paper, we study data-dependent generalization error bounds that exhibit a mild dependency on the number of classes, making them suitable for multi-class learning with a large number of label classes. The bounds generally hold for empirical multi-class risk minimization algorithms using an arbitrary norm as the regularizer. Key to our analysis is new structural results for multi-class Gaussian complexities and empirical ℓ -norm covering numbers, which exploit the Lipschitz continuity of the loss function with respect to the ℓ2 - and ℓ -norm, respectively. We establish data-dependent error bounds in terms of the complexities of a linear function class defined on a finite set induced by training examples, for which we show tight lower and upper bounds. We apply the results to several prominent multi-class learning machines and show a tighter dependency on the number of classes than the state of the art. For instance, for the multi-class support vector machine of Crammer and Singer (2002), we obtain a data-dependent bound with a logarithmic dependency, which is a significant improvement of the previous square-root dependency. The experimental results are reported to verify the effectiveness of our theoretical findings.
Original languageEnglish
Article number8620322
Pages (from-to)2995-3021
JournalIEEE Transactions on Information Theory
Volume65
Issue number5
Online published21 Jan 2019
DOIs
Publication statusPublished - May 2019

Research Keywords

  • covering numbers
  • Gaussian complexities
  • generalization error bounds
  • Multi-class classification
  • Rademacher complexities

Fingerprint

Dive into the research topics of 'Data-Dependent Generalization Bounds for Multi-Class Classification'. Together they form a unique fingerprint.

Cite this