Exploiting Weight-Level Sparsity in Channel Pruning with Low-Rank Approximation

Research output: Chapters, Conference Papers, Creative and Literary Works (RGC: 12, 32, 41, 45) › 32_Refereed conference paper (with ISBN/ISSN) › peer-review

6 Scopus Citations

Author(s)

  • Zhen Chen
  • Jianxin Lin
  • Sen Liu
  • Zhibo Chen
  • Weiping Li
  • Jin Zhao
  • Wei Yan

Detail(s)

Original language: English
Title of host publication: 2019 IEEE International Symposium on Circuits and Systems (ISCAS)
Publisher: IEEE
ISBN (Print): 978-1-7281-0397-6
Publication status: Published - May 2019
Externally published: Yes

Publication series

Name: IEEE International Symposium on Circuits and Systems (ISCAS)
ISSN (Print): 2158-1525

Conference

Title: 2019 IEEE International Symposium on Circuits and Systems (ISCAS 2019)
Place: Japan
City: Sapporo
Period: 26 - 29 May 2019

Abstract

Acceleration and compression of Deep Neural Networks (DNNs) have become critical for deploying intelligence on resource-constrained hardware, especially Internet of Things (IoT) devices. Previous works based on channel pruning can be deployed and accelerated easily, without specialized hardware or software. However, weight-level sparsity is not well explored in channel pruning, which results in a relatively low compression rate. In this work, we propose a framework that combines channel pruning with low-rank decomposition to tackle this problem. First, low-rank decomposition is used to eliminate redundancy within filters, which accelerates the shallow layers. Then, we apply channel pruning to the decomposed network in a global manner, which yields further acceleration in the deep layers. In addition, a spectral norm-based indicator is proposed to balance low-rank approximation and channel pruning. We conduct a series of ablation experiments and show that low-rank decomposition can effectively improve channel pruning by generating small and compact filters. To further demonstrate hardware compatibility, we deploy the pruned networks on an FPGA, and the networks produced by our method exhibit noticeably lower latency.
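
The abstract only outlines the two ingredients of the framework: filter-level low-rank decomposition and global channel pruning, balanced by a spectral norm-based indicator. The sketch below is a rough, NumPy-only illustration of those ideas, not the authors' implementation; the function names, the L1-norm pruning criterion, and the rank/keep ratios are assumptions made for illustration only.

```python
# Minimal sketch (not the authors' released code): truncated-SVD low-rank
# decomposition of a convolution weight, a toy channel-pruning step, and the
# spectral-norm quantity that a spectral norm-based indicator would be built on.
# The L1-norm pruning criterion and the rank/keep ratios are assumptions.
import numpy as np


def low_rank_decompose(weight, rank):
    """Factor a conv weight (out_ch, in_ch, kh, kw) into two smaller tensors
    via truncated SVD of its (out_ch) x (in_ch*kh*kw) matricization."""
    out_ch, in_ch, kh, kw = weight.shape
    mat = weight.reshape(out_ch, -1)
    u, s, vt = np.linalg.svd(mat, full_matrices=False)
    # Keep the top-`rank` components: W ~= (U * S) @ Vt
    left = u[:, :rank] * s[:rank]                    # acts like a 1x1 conv, shape (out_ch, rank)
    right = vt[:rank].reshape(rank, in_ch, kh, kw)   # small kh x kw conv with `rank` filters
    return left, right


def spectral_norm(weight):
    """Largest singular value of the matricized filter bank."""
    mat = weight.reshape(weight.shape[0], -1)
    return float(np.linalg.svd(mat, compute_uv=False)[0])


def prune_channels(weight, keep_ratio):
    """Toy channel pruning: rank output channels by L1 norm (an assumed
    criterion) and keep the strongest fraction."""
    scores = np.abs(weight).reshape(weight.shape[0], -1).sum(axis=1)
    keep = max(1, int(round(keep_ratio * weight.shape[0])))
    kept = np.sort(np.argsort(scores)[-keep:])
    return weight[kept], kept


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal((64, 32, 3, 3)).astype(np.float32)  # dummy conv layer
    left, right = low_rank_decompose(w, rank=16)
    pruned, kept = prune_channels(w, keep_ratio=0.5)
    print("spectral norm:", spectral_norm(w))
    print("decomposed factor shapes:", left.shape, right.shape)
    print("channels kept after pruning:", kept.size)
```

In this sketch the decomposition replaces one convolution with a narrow kh x kw convolution followed by a 1x1 convolution, which is the standard way such an SVD-based factorization reduces computation; how the paper chooses ranks and pruning ratios per layer is governed by its spectral norm-based indicator, which is not reproduced here.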

Research Area(s)

  • deep learning
  • network acceleration
  • channel pruning
  • low-rank decomposition
  • hardware resources

Citation Format(s)

Exploiting Weight-Level Sparsity in Channel Pruning with Low-Rank Approximation. / Chen, Zhen; Lin, Jianxin; Liu, Sen; Chen, Zhibo; Li, Weiping; Zhao, Jin; Yan, Wei.

2019 IEEE International Symposium on Circuits and Systems (ISCAS). IEEE, 2019. Article 8702429.

Research output: Chapters, Conference Papers, Creative and Literary Works (RGC: 12, 32, 41, 45) › 32_Refereed conference paper (with ISBN/ISSN) › peer-review