Characterizing ResNet's Universal Approximation Capability

Chenghao Liu, Enming Liang, Minghua Chen*

*Corresponding author for this work

Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 32 - Refereed conference paper (with host publication) › peer-review

1 Citation (Scopus)

Abstract

Since its debut in 2016, ResNet has become arguably the most favored architecture in deep neural network (DNN) design. It effectively addresses the gradient vanishing/exploding issue in DNN training, allowing engineers to fully unleash DNN's potential in tackling challenging problems in various domains. Despite its practical success, an essential theoretical question remains largely open: how well/best can ResNet approximate functions? In this paper, we answer this question for several important function classes, including polynomials and smooth functions. In particular, we show that ResNet with constant width can approximate Lipschitz continuous functions with Lipschitz constant μ using O(c(d)(ε/μ)^(-d/2)) tunable weights, where c(d) is a constant depending on the input dimension d and ε > 0 is the target approximation error. Further, we extend this result to Lebesgue-integrable functions, with the upper bound characterized by the modulus of continuity. These results indicate a factor-of-d reduction in the number of tunable weights compared with the classical results for ReLU networks. Our results are also order-optimal in ε, thus achieving the optimal approximation rate, as they match a generalized lower bound derived in this paper. This work adds to the theoretical justifications for ResNet's stellar practical performance. © 2024 by the author(s)
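For a concrete picture of the architecture class the abstract refers to, below is a minimal sketch of a constant-width ResNet in PyTorch. The block structure, ReLU placement, widths, and depths here are illustrative assumptions, not the paper's explicit construction; in the paper's result, the width stays fixed and the tunable-weight count, which scales linearly with depth, is what the bound O(c(d)(ε/μ)^(-d/2)) controls.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """One residual block: x + W2 * ReLU(W1 x + b1) + b2 (illustrative form)."""
    def __init__(self, width: int):
        super().__init__()
        self.fc1 = nn.Linear(width, width)
        self.fc2 = nn.Linear(width, width)

    def forward(self, x):
        return x + self.fc2(torch.relu(self.fc1(x)))

class ConstantWidthResNet(nn.Module):
    """Constant-width ResNet: the width is fixed, so the number of tunable
    weights grows linearly in depth; depth is the lever that drives the
    approximation error ε down."""
    def __init__(self, in_dim: int, width: int, depth: int):
        super().__init__()
        self.embed = nn.Linear(in_dim, width)   # lift the d-dim input to the working width
        self.blocks = nn.Sequential(*[ResidualBlock(width) for _ in range(depth)])
        self.head = nn.Linear(width, 1)         # scalar output

    def forward(self, x):
        return self.head(self.blocks(self.embed(x)))

# Example: a network for a Lipschitz target on [0,1]^d.
# The values of d, width, and depth are hypothetical, for illustration only.
d, width, depth = 4, 16, 32
model = ConstantWidthResNet(d, width, depth)
x = torch.rand(8, d)
y = model(x)   # shape (8, 1)
```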
Original language: English
Title of host publication: Proceedings of the 41st International Conference on Machine Learning
Editors: Ruslan Salakhutdinov, Zico Kolter, Katherine Heller, Adrian Weller, Nuria Oliver, Jonathan Scarlett, Felix Berkenkamp
Publisher: ML Research Press
Pages: 31477-31515
Publication status: Published - Jul 2024
Event: 41st International Conference on Machine Learning (ICML 2024) - Messe Wien Exhibition Congress Center, Vienna, Austria
Duration: 21 Jul 2024 - 27 Jul 2024
https://proceedings.mlr.press/v235/
https://icml.cc/

Publication series

Name: Proceedings of Machine Learning Research
Volume: 235
ISSN (Print): 2640-3498

Conference

Conference: 41st International Conference on Machine Learning (ICML 2024)
Country/Territory: Austria
City: Vienna
Period: 21/07/24 - 27/07/24

Funding

This work is supported in part by a General Research Fund from the Research Grants Council, Hong Kong (Project No. 11200223), an InnoHK initiative, The Government of the HKSAR, the Laboratory for AI-Powered Financial Technologies, and a Shenzhen-Hong Kong-Macau Science & Technology Project (Category C, Project No. SGDX20220530111203026). The authors would also like to thank the anonymous reviewers for their helpful comments.
