Binary convolutional neural network acceleration framework for rapid system prototyping
Research output: Journal Publications and Reviews (RGC: 21, 22, 62) › 21_Publication in refereed journal › peer-review
Detail(s)
Original language | English |
---|---|
Article number | 101762 |
Journal / Publication | Journal of Systems Architecture |
Volume | 109 |
Online published | 14 Mar 2020 |
Publication status | Published - Oct 2020 |
Abstract
The large model size and high computational complexity of emerging convolutional neural network (CNN) models make them unsuitable for deployment on current embedded or edge computing devices. Recently, binary neural networks (BNNs) have been explored as a way to reduce model size and avoid costly multiplications. In this paper, a binary network acceleration framework for rapid system prototyping is proposed to promote the deployment of CNNs on embedded devices. First, trainable scaling factors are adopted in binary network training to improve network accuracy. The hardware/software co-design framework supports various compact network structures, such as residual blocks, 1 × 1 squeeze convolution layers, and depthwise separable convolutions. With flexible network binarization and efficient hardware architecture optimization, the acceleration system achieves a throughput of over 2 TOPS, comparable to that of a modern desktop GPU, with much higher power efficiency.
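To illustrate why binarization avoids multiplication, the following is a minimal Python sketch of a binary dot product computed with XNOR and popcount, followed by a single scaling factor. It is a generic illustration of the BNN idea, not the paper's implementation: the function names, the sign-based binarization rule, and the fixed `alpha` value are assumptions made for this example (in the paper the scaling factors are trainable, and the abstract does not specify how they are applied).

```python
def binarize(values):
    """Map real values to {-1, +1} by sign (assumption: 0 maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack a {-1, +1} vector into an int; bit i is set where signs[i] == +1."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def xnor_popcount_dot(a_word, w_word, n):
    """Dot product of two packed {-1, +1} vectors of length n.

    XNOR marks the positions where the signs agree; each agreement
    contributes +1 and each disagreement -1, so dot = 2 * matches - n.
    """
    mask = (1 << n) - 1
    agree = ~(a_word ^ w_word) & mask   # XNOR restricted to n bits
    matches = bin(agree).count("1")     # popcount
    return (matches << 1) - n


def binary_dot(activations, weights, alpha):
    """Binarized dot product rescaled by alpha (hypothetical fixed value here)."""
    n = len(activations)
    a_word = pack_bits(binarize(activations))
    w_word = pack_bits(binarize(weights))
    return alpha * xnor_popcount_dot(a_word, w_word, n)


activations = [0.7, -1.2, 0.3, -0.1]
weights = [0.5, -0.4, -0.9, -0.2]
result = binary_dot(activations, weights, alpha=0.5)
print(result)
```

In this sketch the only real-valued multiplication left is the single rescaling by `alpha` per output; the inner accumulation uses bitwise operations only, which is the property that makes such networks attractive for FPGA implementations.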
Research Area(s)
- Binarization, Convolutional neural network, FPGA, Hardware acceleration, Rapid system prototyping
Citation Format(s)
Binary convolutional neural network acceleration framework for rapid system prototyping. / Xu, Zhe; Cheung, Ray C.C.
In: Journal of Systems Architecture, Vol. 109, 101762, 10.2020.