Deep Neural Network Acceleration Based on Low-Rank Approximated Channel Pruning

Research output: Journal Publications and Reviews (RGC: 21, 22, 62) · Publication in refereed journal · peer-review

16 Scopus Citations

Author(s)

  • Zhen Chen
  • Zhibo Chen
  • Jianxin Lin
  • Sen Liu
  • Weiping Li

Detail(s)

Original language: English
Article number: 8948329
Pages (from-to): 1232-1244
Journal / Publication: IEEE Transactions on Circuits and Systems I: Regular Papers
Volume: 67
Issue number: 4
Online published: 1 Jan 2020
Publication status: Published - Apr 2020

Abstract

Acceleration and compression of deep Convolutional Neural Networks (CNNs) have become critical for deploying intelligence on resource-constrained devices. Previous channel pruning methods can be deployed and accelerated easily, without specialized hardware or software. However, they do not fully exploit weight-level redundancy, which results in a relatively low compression ratio. In this work, we propose a Low-rank Approximated channel Pruning (LAP) framework that tackles this problem in two targeted steps. First, we use low-rank approximation to eliminate the redundancy within each filter; this step yields acceleration, especially in shallow layers, and converts the filters into smaller, more compact ones. Then, we apply channel pruning to the approximated network in a global manner and obtain further gains, especially in deep layers. In addition, we propose a spectral-norm-based indicator to better coordinate the two steps. Moreover, inspired by the integral idea used in video coding, we propose an evaluator based on the Integral of Decay Curve (IDC) to judge the efficiency of different acceleration and compression algorithms. Ablation experiments and the IDC evaluator show that LAP significantly improves channel pruning. To further demonstrate hardware compatibility, the network produced by LAP achieves impressive speedup efficiency on an FPGA.
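The abstract describes a two-step pipeline (per-filter low-rank approximation, then global channel pruning) plus an IDC evaluator. The sketch below shows one common way to realize these ingredients in PyTorch: an SVD-based rank-r factorization of a convolution into two smaller convolutions, an L1-norm global channel ranking (a simple stand-in for the paper's spectral-norm-based indicator), and a trapezoidal area-under-curve evaluator in the spirit of IDC. All function names and criteria here are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn

def low_rank_approximate(conv: nn.Conv2d, rank: int) -> nn.Sequential:
    """Step 1 (sketch): factor a KxK convolution into a rank-`rank` KxK
    conv followed by a 1x1 conv, via SVD of the unfolded weight matrix.
    The paper's exact factorization scheme may differ."""
    out_c, in_c, kh, kw = conv.weight.shape
    W = conv.weight.data.reshape(out_c, -1)           # out_c x (in_c*kh*kw)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    # S[0] is the spectral norm of the unfolded weights, the quantity
    # underlying the paper's spectral-norm-based coordination indicator.
    U_r = U[:, :rank] * S[:rank]                      # out_c x rank
    V_r = Vh[:rank, :]                                # rank x (in_c*kh*kw)
    first = nn.Conv2d(in_c, rank, (kh, kw), stride=conv.stride,
                      padding=conv.padding, bias=False)
    first.weight.data = V_r.reshape(rank, in_c, kh, kw)
    second = nn.Conv2d(rank, out_c, kernel_size=1,
                       bias=conv.bias is not None)
    second.weight.data = U_r.reshape(out_c, rank, 1, 1)
    if conv.bias is not None:
        second.bias.data = conv.bias.data
    return nn.Sequential(first, second)

def global_channel_ranking(model: nn.Module):
    """Step 2 (sketch): score every output channel across the whole
    (approximated) network; the lowest-scoring channels are pruning
    candidates. L1 norm is a common stand-in criterion."""
    scores = []
    for name, m in model.named_modules():
        if isinstance(m, nn.Conv2d):
            l1 = m.weight.data.abs().sum(dim=(1, 2, 3))
            scores += [(float(s), name, i) for i, s in enumerate(l1)]
    return sorted(scores)                             # ascending: prune first

def idc(speedup_ratios, accuracies):
    """Evaluator (sketch): area under the accuracy-vs-speedup decay curve
    via the trapezoidal rule, in the spirit of the paper's Integral of
    Decay Curve; the exact definition may differ."""
    x, idx = torch.sort(torch.tensor(speedup_ratios, dtype=torch.float))
    y = torch.tensor(accuracies, dtype=torch.float)[idx]
    return torch.trapz(y, x).item()
```

Under these assumptions, a pipeline would apply low_rank_approximate layer by layer (choosing the rank per layer), fine-tune, prune the lowest-ranked channels from global_channel_ranking, and fine-tune again; idc then compares the resulting accuracy-efficiency trade-off curves across methods.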

Research Area(s)

  • Deep learning, network acceleration, channel pruning, low-rank approximation, efficiency evaluation, hardware resources

Citation Format(s)

Deep Neural Network Acceleration Based on Low-Rank Approximated Channel Pruning. / Chen, Zhen; Chen, Zhibo; Lin, Jianxin; Liu, Sen; Li, Weiping.

In: IEEE Transactions on Circuits and Systems I: Regular Papers, Vol. 67, No. 4, 8948329, 04.2020, p. 1232-1244.
