Binary convolutional neural network acceleration framework for rapid system prototyping

Research output: Journal Publications and Reviews › Publication in refereed journal › Peer-reviewed

15 Scopus Citations

Original language: English
Article number: 101762
Journal / Publication: Journal of Systems Architecture
Online published: 14 Mar 2020
Publication status: Published - Oct 2020


The huge model size and high computational complexity of emerging convolutional neural network (CNN) models make them unsuitable for deployment on current embedded or edge computing devices. Recently, binary neural networks (BNNs) have been explored to reduce model size and avoid costly multiplications. In this paper, a binary network acceleration framework for rapid system prototyping is proposed to promote the deployment of CNNs on embedded devices. First, trainable scaling factors are adopted in binary network training to improve network accuracy. The hardware/software co-design framework supports various compact network structures, such as residual blocks, 1 × 1 squeeze convolution layers, and depthwise separable convolutions. With flexible network binarization and efficient hardware architecture optimization, the acceleration system achieves over 2 TOPS of throughput, comparable to a modern desktop GPU, at much higher power efficiency.
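The core idea of weight binarization with a scaling factor can be sketched as follows. This is a minimal NumPy illustration, assuming XNOR-Net-style sign-and-scale binarization; here the factor `alpha` is initialized to the mean absolute weight as a common heuristic, whereas the paper instead makes it trainable and learns it by backpropagation. The helper name `binarize` is hypothetical, not from the paper.

```python
import numpy as np

def binarize(weights, alpha):
    """Map full-precision weights to {-alpha, +alpha}.

    sign(w) carries the binary weight; alpha is a per-layer scaling
    factor (trainable in the paper, heuristic here).
    """
    return alpha * np.sign(weights)

# Toy 2x2 full-precision kernel.
w = np.array([[0.4, -0.2],
              [-0.7, 0.1]])

# Heuristic initialization: mean absolute weight (hypothetical choice;
# the paper learns this factor during training instead).
alpha = np.mean(np.abs(w))  # 0.35 for this toy kernel

wb = binarize(w, alpha)

# In hardware the binary dot product reduces to XNOR + popcount; in
# float form it is just sign agreement scaled by alpha.
x = np.array([1.0, -1.0])
y = wb @ x  # [0.7, -0.7]
```

Because every weight collapses to a single sign bit plus one shared scalar per layer, the multiply-accumulate in convolution becomes an XNOR/popcount followed by one multiplication by `alpha`, which is what enables the FPGA throughput figures quoted above.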

Research Area(s)

  • Binarization, Convolutional neural network, FPGA, Hardware acceleration, Rapid system prototyping