Fast data-free model compression via dictionary-pair reconstruction
Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review
Detail(s)
| Original language | English |
|---|---|
| Pages (from-to) | 3435–3461 |
| Journal / Publication | Knowledge and Information Systems |
| Volume | 65 |
| Issue number | 8 |
| Online published | 11 Apr 2023 |
| Publication status | Published - Aug 2023 |
Abstract
Deep neural networks (DNNs) have obtained satisfactory results on different vision tasks; however, they usually involve large models with massive parameters, which complicates model deployment. DNN compression can effectively reduce the memory footprint of a deep model so that it can be deployed on portable devices. However, most existing model compression methods, e.g., vector quantization or pruning, are time-consuming, which makes them ill-suited to applications that need fast computation. In this paper, we therefore explore how to accelerate the model compression process by reducing its computation cost. We propose a new model compression method, termed dictionary-pair-based fast data-free DNN compression, which aims to reduce the memory consumption of DNNs without extra training and can greatly improve compression efficiency. Specifically, our method performs tensor decomposition of the DNN model with a fast dictionary-pair learning-based reconstruction approach, which can be applied to different weight layers (e.g., convolutional and fully connected layers). Given a pre-trained DNN model, we first divide the parameters (i.e., weights) of each layer into a series of partitions for dictionary-pair-driven fast reconstruction, which can potentially discover more fine-grained information and enables parallel model compression. Then, dictionaries with a smaller memory footprint are learned to reconstruct the weights. Moreover, automatic hyper-parameter tuning and a shared-dictionary mechanism are proposed to improve the model's performance and availability. Extensive experiments on popular DNN models (i.e., VGG-16, ResNet-18 and ResNet-50) showed that our proposed weight compression method can significantly reduce the memory footprint and speed up the compression process, with less performance loss. © 2023, The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature.
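To illustrate the general idea of reconstructing a weight matrix from a compact dictionary and codes, below is a minimal sketch in NumPy. It is *not* the authors' algorithm: it uses a simple alternating-least-squares factorization (a stand-in for dictionary-pair learning) to approximate a layer's weights `W` by a small dictionary `D` and codes `A`, so that only `D` and `A` need to be stored; the function name and all parameters are hypothetical.

```python
import numpy as np

def compress_layer(W, k, n_iter=20, seed=0):
    """Hypothetical sketch: approximate W (n x m) as D @ A, where
    D (n x k) acts as a small synthesis dictionary and A (k x m)
    holds the codes, learned by alternating least squares."""
    rng = np.random.default_rng(seed)
    n, m = W.shape
    D = rng.standard_normal((n, k))
    for _ in range(n_iter):
        # Update codes A for the current dictionary (least squares).
        A, *_ = np.linalg.lstsq(D, W, rcond=None)
        # Update dictionary D for the current codes.
        D = W @ np.linalg.pinv(A)
    return D, A

# Toy example: a weight matrix with low effective rank compresses well.
rng = np.random.default_rng(1)
W = rng.standard_normal((64, 8)) @ rng.standard_normal((8, 64))
D, A = compress_layer(W, k=8)
W_hat = D @ A
# Storage drops from 64*64 = 4096 floats to 64*8 + 8*64 = 1024 floats.
```

In the paper's setting, each layer's weights would additionally be split into partitions compressed independently (enabling parallelism), and the dictionary sizes would be chosen by the proposed automatic hyper-parameter tuning rather than fixed by hand as above.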
Research Area(s)
- Dictionary-pair-driven fast DNN compression, Efficient model compression, Fast weight reconstruction, Less performance loss
Citation Format(s)
Fast data-free model compression via dictionary-pair reconstruction. / Gao, Yangcheng; Zhang, Zhao; Zhang, Haijun et al.
In: Knowledge and Information Systems, Vol. 65, No. 8, 08.2023, p. 3435–3461.