Abstract
Robustness and uncertainty estimation are crucial to the safety of deep neural networks deployed on the edge. A deep ensemble, composed of a set of individual deep neural networks (members), achieves strong accuracy, uncertainty estimation, and robustness to out-of-distribution data and adversarial attacks. However, storage and memory consumption increase linearly with the number of members in the ensemble. Previous works reduce the ensemble size, and thus storage and memory consumption, by selecting better members, applying layer-wise low-rank approximation to ensemble parameters, or designing partial ensemble models. In this work, we focus on quantization of the ensemble, which serves as the last mile of network deployment. We propose a differentiable and parallelizable bit-sharing scheme that lets members share the less significant bits of their parameters without hurting performance, while keeping the more significant bits separate. The intuition is that, numerically, more significant bits (e.g., the sign bit) are more useful in distinguishing one member from another. For real deployment of the bit-sharing scheme, we further propose an efficient encoding-decoding scheme with minimal storage overhead. Experimental results show that Bits-Ensemble reduces the storage size of an ensemble by over 22×, with only a 0.36× increase in training latency and no sacrifice of inference latency. The code is available at https://github.com/ralphc1212/bitsensemble.
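The bit-sharing idea can be illustrated with a toy sketch (pure Python; this is not the paper's implementation — the 4-bit split point, the element-wise averaging of the shared bits, and all names are illustrative assumptions):

```python
import random

def quantize(w, n_bits=8):
    """Uniformly quantize a weight in [-1, 1) to a signed n-bit integer."""
    scale = 2 ** (n_bits - 1)
    return max(-scale, min(scale - 1, round(w * scale)))

random.seed(0)
# Hypothetical example: an ensemble of 4 members, each with 16 weights.
members = [[random.uniform(-1, 1) for _ in range(16)] for _ in range(4)]
# Store each quantized weight as an unsigned two's-complement byte.
q = [[quantize(w) & 0xFF for w in m] for m in members]

SHARED_LOW_BITS = 4               # assumption: the 4 least significant bits are shared
low_mask = (1 << SHARED_LOW_BITS) - 1

# Per-member high bits stay distinct (they carry sign/magnitude and thus diversity) ...
high = [[b & ~low_mask & 0xFF for b in qm] for qm in q]
# ... while a single shared copy of the low bits is kept (here: element-wise average).
shared_low = [(sum(qm[i] & low_mask for qm in q) // len(q)) & low_mask
              for i in range(16)]

# Each member's weights are reconstructed from its own high bits + the shared low bits.
recon = [[h | s for h, s in zip(hm, shared_low)] for hm in high]

# Storage per weight position: 4 members x 8 bits vs. 4 x 4 high bits + 4 shared bits.
orig_bits = 4 * 8
shared_bits = 4 * (8 - SHARED_LOW_BITS) + SHARED_LOW_BITS
print(f"bits per weight position: {orig_bits} -> {shared_bits}")  # prints 32 -> 20
```

The reconstructed weights keep every member's high bits untouched, so member diversity is preserved; only the low bits, which contribute little to distinguishing members, are collapsed into one shared copy.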
| Original language | English |
|---|---|
| Pages (from-to) | 4397-4408 |
| Journal | IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems |
| Volume | 41 |
| Issue number | 11 |
| Online published | 10 Aug 2022 |
| DOIs | |
| Publication status | Published - Nov 2022 |
Funding
This work was supported in part by the Research Grants Council of the Hong Kong, SAR, China, under Project CityU 11219319.
Research Keywords
- Bits-Ensemble
- Deep ensemble
- edge computing
- Indexes
- Matrix decomposition
- neural network quantization
- Neural networks
- Quantization (signal)
- Robustness
- Training
- Uncertainty
Publisher's Copyright Statement
- COPYRIGHT TERMS OF DEPOSITED POSTPRINT FILE: © 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. Cui, Y., Wu, S., Li, Q., Chan, A. B., Kuo, T-W., & Xue, C. J. (2022). Bits-Ensemble: Towards Light-Weight Robust Deep Ensemble by Bits-Sharing. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 41(11), 4397-4408. https://doi.org/10.1109/TCAD.2022.3197986
RGC Funding Information
- RGC-funded
Projects
- GRF: Multi-layer Compression for Lean Flash Storage
XUE, C. J. (Principal Investigator / Project Coordinator)
1/07/19 → 1/06/23
Project: Research