TY - GEN
T1 - Dictionary Pair-based Data-Free Fast Deep Neural Network Compression
AU - Gao, Yangcheng
AU - Zhang, Zhao
AU - Zhang, Haijun
AU - Zhao, Mingbo
AU - Yang, Yi
AU - Wang, Meng
PY - 2021/12
Y1 - 2021/12
N2 - Deep neural network (DNN) compression can effectively reduce the memory footprint of deep networks, so that deep models can be deployed on portable devices. However, most existing model compression methods, e.g., vector quantization or pruning, are time-consuming, which makes them ill-suited to real-world applications that require fast online computation. In this paper, we therefore explore how to accelerate the model compression process by reducing its computation cost. We then propose a new deep model compression method, termed Dictionary Pair-based Data-Free Fast DNN Compression, which aims at reducing the memory consumption of DNNs without extra training and can greatly improve compression efficiency. Specifically, our proposed method performs tensor decomposition on the DNN model with a fast dictionary pair learning-based reconstruction approach, which can be applied to different layers (e.g., convolutional and fully-connected layers). Given a pre-trained DNN model, we first divide the parameters (i.e., weights) of each layer into a series of partitions for dictionary pair-based fast reconstruction, which can potentially discover more fine-grained information and enables parallel model compression. Then, dictionaries with a smaller memory footprint are learned to reconstruct the weights. Extensive experiments on popular DNNs (i.e., VGG-16, ResNet-18 and ResNet-50) show that our proposed weight compression method can significantly reduce the memory footprint and speed up the compression process with little performance loss.
AB - Deep neural network (DNN) compression can effectively reduce the memory footprint of deep networks, so that deep models can be deployed on portable devices. However, most existing model compression methods, e.g., vector quantization or pruning, are time-consuming, which makes them ill-suited to real-world applications that require fast online computation. In this paper, we therefore explore how to accelerate the model compression process by reducing its computation cost. We then propose a new deep model compression method, termed Dictionary Pair-based Data-Free Fast DNN Compression, which aims at reducing the memory consumption of DNNs without extra training and can greatly improve compression efficiency. Specifically, our proposed method performs tensor decomposition on the DNN model with a fast dictionary pair learning-based reconstruction approach, which can be applied to different layers (e.g., convolutional and fully-connected layers). Given a pre-trained DNN model, we first divide the parameters (i.e., weights) of each layer into a series of partitions for dictionary pair-based fast reconstruction, which can potentially discover more fine-grained information and enables parallel model compression. Then, dictionaries with a smaller memory footprint are learned to reconstruct the weights. Extensive experiments on popular DNNs (i.e., VGG-16, ResNet-18 and ResNet-50) show that our proposed weight compression method can significantly reduce the memory footprint and speed up the compression process with little performance loss.
KW - dictionary pair-based fast compression of DNNs
KW - fast weight reconstruction
KW - less performance loss
KW - model compression efficiency
UR - http://www.scopus.com/inward/record.url?scp=85125186561&partnerID=8YFLogxK
U2 - 10.1109/ICDM51629.2021.00022
DO - 10.1109/ICDM51629.2021.00022
M3 - RGC 32 - Refereed conference paper (with host publication)
SN - 9781665423991
T3 - Proceedings - IEEE International Conference on Data Mining, ICDM
SP - 121
EP - 130
BT - Proceedings - 21st IEEE International Conference on Data Mining, ICDM 2021
A2 - Bailey, James
A2 - Miettinen, Pauli
A2 - Koh, Yun Sing
A2 - Tao, Dacheng
A2 - Wu, Xindong
PB - IEEE
T2 - 21st IEEE International Conference on Data Mining (ICDM 2021)
Y2 - 7 December 2021 through 10 December 2021
ER -