TY - JOUR
T1 - Deep Convolutional Neural Network Compression via Coupled Tensor Decomposition
AU - Sun, Weize
AU - Chen, Shaowu
AU - Huang, Lei
AU - So, Hing Cheung
AU - Xie, Min
PY - 2021/4
Y1 - 2021/4
N2 - Large neural networks have enabled impressive progress in various real-world applications. However, the high storage and computational requirements of deep networks make them difficult to deploy on mobile devices. Recently, matrix and tensor decompositions have been employed to compress neural networks. In this paper, we develop a simultaneous tensor decomposition technique for network optimization. The shared network structure is first discussed. Sometimes, not only the structure but also the parameters are shared to form a compressed model, at the expense of degraded performance. This indicates that the weight tensors of different layers within one network contain both identical and independent components. To exploit this characteristic, two new coupled tensor train decompositions are developed for the fully and partly structure-sharing cases, and an alternating optimization approach is proposed for low-rank tensor computation. Finally, we restore the performance of the neural network model by fine-tuning. The compression ratio of the devised approach can then be calculated. Experimental results are included to demonstrate the benefits of our algorithm for both image reconstruction and classification, using well-known datasets such as Cifar-10/Cifar-100 and ImageNet and widely used networks such as ResNet. Compared with state-of-the-art independent matrix and tensor decomposition based methods, our model obtains better network performance under the same compression ratio.
AB - Large neural networks have enabled impressive progress in various real-world applications. However, the high storage and computational requirements of deep networks make them difficult to deploy on mobile devices. Recently, matrix and tensor decompositions have been employed to compress neural networks. In this paper, we develop a simultaneous tensor decomposition technique for network optimization. The shared network structure is first discussed. Sometimes, not only the structure but also the parameters are shared to form a compressed model, at the expense of degraded performance. This indicates that the weight tensors of different layers within one network contain both identical and independent components. To exploit this characteristic, two new coupled tensor train decompositions are developed for the fully and partly structure-sharing cases, and an alternating optimization approach is proposed for low-rank tensor computation. Finally, we restore the performance of the neural network model by fine-tuning. The compression ratio of the devised approach can then be calculated. Experimental results are included to demonstrate the benefits of our algorithm for both image reconstruction and classification, using well-known datasets such as Cifar-10/Cifar-100 and ImageNet and widely used networks such as ResNet. Compared with state-of-the-art independent matrix and tensor decomposition based methods, our model obtains better network performance under the same compression ratio.
KW - neural network compression
KW - coupled tensor train decomposition
KW - low-rank tensor computation
UR - http://www.scopus.com/inward/record.url?scp=85096857208&partnerID=8YFLogxK
U2 - 10.1109/JSTSP.2020.3038227
DO - 10.1109/JSTSP.2020.3038227
M3 - RGC 21 - Publication in refereed journal
SN - 1932-4553
VL - 15
SP - 603
EP - 616
JO - IEEE Journal of Selected Topics in Signal Processing
JF - IEEE Journal of Selected Topics in Signal Processing
IS - 3
ER -