A Kernel Unfolding Approach to Trade Data Movement with Computation Power for CNN Acceleration
Research output: Chapters, Conference Papers, Creative and Literary Works (RGC: 12, 32, 41, 45) › 32_Refereed conference paper (with ISBN/ISSN) › peer-review
Detail(s)
Original language | English |
---|---|
Title of host publication | PROCEEDINGS: 9th IEEE Non-Volatile Memory Systems and Applications Symposium — NVMSA 2020 — |
Publisher | IEEE |
ISBN (Electronic) | 9781728184821 |
ISBN (Print) | 9781728184838 |
Publication status | Published - Aug 2020 |
Publication series
Name | Proceedings - IEEE Non-Volatile Memory Systems and Applications Symposium, NVMSA |
---|---|
ISSN (Print) | 2575-2561 |
ISSN (Electronic) | 2575-257X |
Conference
Title | 9th IEEE Non-Volatile Memory Systems and Applications Symposium (NVMSA 2020) |
---|---|
Location | Virtual |
Place | Korea, Republic of |
City | Seoul |
Period | 19 - 21 August 2020 |
Abstract
Convolutional neural networks (CNNs) achieve human-level accuracy in image classification applications. However, their complicated structure demands a large number of multiply-accumulate (MAC) operations and incurs a huge data movement cost. The situation worsens with the asymmetric growth of computing power and memory speed on von Neumann-based architectures. Recently, processing-in-memory (PIM) designs have been adopted to reduce data communication cost by storing the parameters in memory. However, the significant cost of feeding the input feature map remains a big challenge, especially for PIM devices with high bandwidth but long access latency. Thus, we explore how to trade space in PIM to eliminate this cost. A kernel unfolding technique is proposed to eliminate the duplicated feeding of the input feature map while keeping the memory cells in PIM highly utilized to achieve peak computing throughput. As a result, the memory bandwidth can be utilized efficiently and the corresponding execution time can be reduced significantly. The results show that the proposed design achieves up to a 16.2× cycle improvement over traditional PIM designs.
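The space-for-data-movement trade described in the abstract can be illustrated with a minimal sketch. It is not the authors' design: it assumes that "unfolding" means storing one shifted copy of the kernel per output position, so that the flattened input is fed only once. All function names here are hypothetical, and the "cells" count is only a stand-in for PIM memory space.

```python
def sliding_window_conv(x, k):
    """Valid 2D convolution (CNN-style cross-correlation) by sliding window.

    Returns (output rows, number of input values fed to the MAC units).
    Overlapping windows cause the same input value to be fed repeatedly.
    """
    H, W = len(x), len(x[0])
    R, S = len(k), len(k[0])
    out, fed = [], 0
    for i in range(H - R + 1):
        row = []
        for j in range(W - S + 1):
            acc = 0
            for r in range(R):
                for s in range(S):
                    acc += x[i + r][j + s] * k[r][s]
                    fed += 1  # one input value fed per MAC
            row.append(acc)
        out.append(row)
    return out, fed


def unfold_kernel(k, H, W):
    """Assumed unfolding: one column per output position, each column
    holding the kernel weights placed at that position's input offsets."""
    R, S = len(k), len(k[0])
    oh, ow = H - R + 1, W - S + 1
    m = [[0] * (oh * ow) for _ in range(H * W)]
    for i in range(oh):
        for j in range(ow):
            col = i * ow + j
            for r in range(R):
                for s in range(S):
                    m[(i + r) * W + (j + s)][col] = k[r][s]
    return m


def unfolded_conv(x, k):
    """Feed the flattened input once against the unfolded kernel matrix.

    Returns (flat output, input values fed, memory cells occupied).
    """
    H, W = len(x), len(x[0])
    m = unfold_kernel(k, H, W)
    flat = [v for row in x for v in row]  # each input value appears once
    n_out = len(m[0])
    out = [sum(flat[p] * m[p][c] for p in range(H * W)) for c in range(n_out)]
    return out, len(flat), len(m) * n_out


x = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
k = [[1, 0, -1], [1, 0, -1], [1, 0, -1]]
ref, fed_sliding = sliding_window_conv(x, k)
out, fed_unfolded, cells = unfolded_conv(x, k)
```

In this toy case (4×4 input, 3×3 kernel), the sliding window feeds 36 input values while the unfolded form feeds 16, at the price of 64 stored cells instead of 9. Both produce the same four outputs.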
Citation Format(s)
A Kernel Unfolding Approach to Trade Data Movement with Computation Power for CNN Acceleration. / Wu, Yueh-Han; Wang, Tse-Yuan; Chang, Yuan-Hao; Kuo, Tei-Wei; Chang, Hung-Sheng.
PROCEEDINGS: 9th IEEE Non-Volatile Memory Systems and Applications Symposium — NVMSA 2020 —. IEEE, 2020. 9188176 (Proceedings - IEEE Non-Volatile Memory Systems and Applications Symposium, NVMSA).