A Kernel Unfolding Approach to Trade Data Movement with Computation Power for CNN Acceleration

Research output: Chapters, Conference Papers, Creative and Literary Works (RGC: 12, 32, 41, 45) › 32_Refereed conference paper (with ISBN/ISSN) › peer-review

Author(s)

  • Yueh-Han Wu
  • Tse-Yuan Wang
  • Yuan-Hao Chang
  • Tei-Wei Kuo
  • Hung-Sheng Chang

Detail(s)

Original language: English
Title of host publication: PROCEEDINGS: 9th IEEE Non-Volatile Memory Systems and Applications Symposium — NVMSA 2020 —
Publisher: IEEE
ISBN (Electronic): 9781728184821
ISBN (Print): 9781728184838
Publication status: Published - Aug 2020

Publication series

Name: Proceedings - IEEE Non-Volatile Memory Systems and Applications Symposium, NVMSA
ISSN (Print): 2575-2561
ISSN (Electronic): 2575-257X

Conference

Title: 9th IEEE Non-Volatile Memory Systems and Applications Symposium (NVMSA 2020)
Location: Virtual
Place: Korea, Republic of
City: Seoul
Period: 19 - 21 August 2020

Abstract

Convolutional neural networks (CNNs) achieve human-level accuracy in image classification applications. However, their complicated structure requires a large number of multiply-accumulate (MAC) operations and results in a huge data movement cost. This situation worsens with the asymmetric growth of computing power and memory speed on von Neumann-based architectures. Recently, processing-in-memory (PIM) designs have been adopted to reduce the data communication cost by storing parameters in memory. However, the significant cost of feeding the input feature map remains a big challenge, especially for PIM devices with high bandwidth but long access latency. Thus, we explore how to trade space in PIM to eliminate this cost. A kernel unfolding technique is proposed to eliminate the duplicated feeding of the input feature map while keeping the memory cells in PIM highly utilized to achieve peak computing throughput. As a result, the memory bandwidth can be utilized efficiently and the corresponding execution time can be reduced significantly. The results show that the proposed design achieves up to a 16.2× cycle improvement compared to traditional PIM designs.
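The abstract does not spell out the unfolding scheme, so the following is only a minimal sketch of the general idea of trading memory space for fewer input feeds, not the authors' design. It assumes a single-channel, stride-1, unpadded convolution, and the function names (`conv2d_sliding`, `unfold_kernel`, `conv2d_unfolded`) are hypothetical. The kernel is unfolded into a matrix with one row per output position, so the flattened input feature map is supplied once instead of once per overlapping window.

```python
def conv2d_sliding(x, k):
    """Baseline: sliding-window convolution (no padding, stride 1).
    Every window re-reads the input elements it overlaps."""
    H, W = len(x), len(x[0])
    R, S = len(k), len(k[0])
    return [
        [
            sum(x[i + r][j + s] * k[r][s] for r in range(R) for s in range(S))
            for j in range(W - S + 1)
        ]
        for i in range(H - R + 1)
    ]


def unfold_kernel(k, H, W):
    """Unfold kernel k for an H x W input (illustrative assumption).
    Returns a matrix with one row per output position and one column
    per input element; unused positions hold zeros."""
    R, S = len(k), len(k[0])
    out_h, out_w = H - R + 1, W - S + 1
    M = [[0] * (H * W) for _ in range(out_h * out_w)]
    for i in range(out_h):
        for j in range(out_w):
            for r in range(R):
                for s in range(S):
                    M[i * out_w + j][(i + r) * W + (j + s)] = k[r][s]
    return M


def conv2d_unfolded(x, k):
    """Same result as conv2d_sliding, computed as one matrix-vector
    product: the flattened input is fed a single time."""
    H, W = len(x), len(x[0])
    out_w = W - len(k[0]) + 1
    M = unfold_kernel(k, H, W)
    flat = [v for row in x for v in row]
    y = [sum(m * v for m, v in zip(row, flat)) for row in M]
    return [y[i:i + out_w] for i in range(0, len(y), out_w)]
```

For a 4×4 input and a 2×2 kernel, the sliding version performs 9 windows × 4 reads = 36 input reads, while the unfolded version feeds the 16 input values once, at the price of storing a 9×16 matrix instead of 4 weights. This is the space-for-movement trade the abstract describes, shown here in software only.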

Citation Format(s)

A Kernel Unfolding Approach to Trade Data Movement with Computation Power for CNN Acceleration. / Wu, Yueh-Han; Wang, Tse-Yuan; Chang, Yuan-Hao; Kuo, Tei-Wei; Chang, Hung-Sheng.

PROCEEDINGS: 9th IEEE Non-Volatile Memory Systems and Applications Symposium — NVMSA 2020 —. IEEE, 2020. 9188176 (Proceedings - IEEE Non-Volatile Memory Systems and Applications Symposium, NVMSA).
