TY - GEN
T1 - A 512Gb In-Memory-Computing 3D-NAND Flash Supporting Similar-Vector-Matching Operations on Edge-AI Devices
AU - Hu, Han-Wen
AU - Wang, Wei-Chen
AU - Chen, Chung-Kuang
AU - Lee, Yung-Chun
AU - Lin, Bo-Rong
AU - Wang, Huai-Mu
AU - Lin, Yen-Po
AU - Lin, Yu-Chao
AU - Hsieh, Chih-Chang
AU - Hu, Chia-Ming
AU - Lai, Yi-Ting
AU - Chen, Han-Sung
AU - Chang, Yuan-Hao
AU - Li, Hsiang-Pang
AU - Kuo, Tei-Wei
AU - Wang, Keh-Chung
AU - Chang, Meng-Fan
AU - Hung, Chun-Hsiung
AU - Lu, Chin-Yuan
PY - 2022/2
Y1 - 2022/2
N2 - Similar-vector-matching (SVM) applications for unstructured vectors that are generated via machine-learning methods, such as face search and audio texturing from a dataset for access control systems, are frequently operated on edge devices, as depicted in Fig. 7.5.1. The SVM operation [1]-[3] typically comprises four steps: (1) in the offline phase, the extracted raw vectors (VRAW) are obtained from machine-learning approaches and stored in non-volatile NAND Flash; (2) in the online phase, a processor requests VRAW data from edge storage; (3) the entire VRAW dataset is moved from storage to the processor; (4) the processor scores the similarities between an input query and each candidate VRAW and provides a best match. However, moving a large amount of data across the memory hierarchy consumes a large amount of energy (EMEM) and also results in a long search latency (tSR) for SVM operations. The entire VRAW dataset includes a large amount of invalid data. Reducing data movement will lower EMEM and tSR; to that end, edge storage with nonvolatile computing-in-memory (nvCIM) support for similarity computation (vector-vector multiplication (VVM) for cosine similarity) is required to reduce the VRAW dataset to a small candidate size. However, there are challenges in leveraging 3D NAND for VVM operations: (1) low readout accuracy when a large amount of current is summed using the wide range of cell Vt levels (e.g., 1st to 4th Vt-level of TLC cell) and (2) the large readout power consumption required to achieve a constant settling time against a wide range of summation currents for the possible data patterns.
AB - Similar-vector-matching (SVM) applications for unstructured vectors that are generated via machine-learning methods, such as face search and audio texturing from a dataset for access control systems, are frequently operated on edge devices, as depicted in Fig. 7.5.1. The SVM operation [1]-[3] typically comprises four steps: (1) in the offline phase, the extracted raw vectors (VRAW) are obtained from machine-learning approaches and stored in non-volatile NAND Flash; (2) in the online phase, a processor requests VRAW data from edge storage; (3) the entire VRAW dataset is moved from storage to the processor; (4) the processor scores the similarities between an input query and each candidate VRAW and provides a best match. However, moving a large amount of data across the memory hierarchy consumes a large amount of energy (EMEM) and also results in a long search latency (tSR) for SVM operations. The entire VRAW dataset includes a large amount of invalid data. Reducing data movement will lower EMEM and tSR; to that end, edge storage with nonvolatile computing-in-memory (nvCIM) support for similarity computation (vector-vector multiplication (VVM) for cosine similarity) is required to reduce the VRAW dataset to a small candidate size. However, there are challenges in leveraging 3D NAND for VVM operations: (1) low readout accuracy when a large amount of current is summed using the wide range of cell Vt levels (e.g., 1st to 4th Vt-level of TLC cell) and (2) the large readout power consumption required to achieve a constant settling time against a wide range of summation currents for the possible data patterns.
UR - http://www.scopus.com/inward/record.url?scp=85128258111&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-85128258111&origin=recordpage
U2 - 10.1109/ISSCC42614.2022.9731775
DO - 10.1109/ISSCC42614.2022.9731775
M3 - RGC 32 - Refereed conference paper (with host publication)
SN - 9781665428019
T3 - Digest of Technical Papers - IEEE International Solid-State Circuits Conference
SP - 138
EP - 140
BT - 2022 IEEE International Solid-State Circuits Conference, ISSCC 2022
PB - IEEE
T2 - 2022 IEEE International Solid-State Circuits Conference (ISSCC 2022)
Y2 - 20 February 2022 through 26 February 2022
ER -