RM-SSD : In-Storage Computing for Large-Scale Recommendation Inference
Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 32 - Refereed conference paper (with host publication) › peer-review
Author(s)
Detail(s)
| Original language | English |
| --- | --- |
| Title of host publication | Proceedings - 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA 2022) |
| Publisher | Institute of Electrical and Electronics Engineers, Inc. |
| Pages | 1056-1070 |
| ISBN (electronic) | 978-1-6654-2027-3 |
| Publication status | Published - 2022 |
Publication series
| Name | Proceedings - International Symposium on High-Performance Computer Architecture |
| --- | --- |
| Volume | 2022-April |
| ISSN (Print) | 1530-0897 |
Conference
| Title | 28th IEEE International Symposium on High-Performance Computer Architecture (HPCA 2022) |
| --- | --- |
| Location | Virtual |
| Place | Korea, Republic of |
| City | Seoul |
| Period | 2 - 6 April 2022 |
Link(s)
| Link to Scopus | https://www.scopus.com/record/display.uri?eid=2-s2.0-85130735559&origin=recordpage |
| --- | --- |
| Permanent Link | https://scholars.cityu.edu.hk/en/publications/publication(0524253e-eb80-4d8e-9f6f-6e2a2a53accb).html |
Abstract
To meet the strict service-level-agreement requirements of recommendation systems, the entire set of embeddings must be loaded into memory. However, as production-scale recommendation models and datasets grow, the size of the embeddings is approaching the limit of memory capacity. Limited physical memory constrains the algorithms that can be trained and deployed, posing a severe challenge for deploying advanced recommendation systems. Recent studies offload embedding lookups to SSDs, targeting embedding-dominated recommendation models. This paper takes this one step further and proposes offloading the entire recommendation system to an SSD with in-storage computing capability. The proposed SSD-side FPGA solution leverages a low-end FPGA to speed up both embedding-dominated and MLP-dominated models with high resource efficiency. We evaluate the proposed solution with a prototype SSD. Results show a 20-100x throughput improvement over the baseline SSD and a 1.5-15x improvement over the state of the art.
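To make the abstract's terms concrete, the following is a minimal sketch (not the paper's implementation; all sizes and names are illustrative) of a DLRM-style inference step: the sparse embedding-table gathers are the memory-capacity bottleneck the abstract describes, and the dense MLP is the compute-bound half, both of which RM-SSD proposes to serve from in-storage compute.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes; production embedding tables reach hundreds of GBs.
NUM_TABLES, ROWS_PER_TABLE, EMB_DIM = 8, 10_000, 16
tables = [rng.standard_normal((ROWS_PER_TABLE, EMB_DIM))
          for _ in range(NUM_TABLES)]

def embedding_lookup(table, ids):
    """Gather-and-sum pooling: the sparse, memory-bound half of the model."""
    return table[ids].sum(axis=0)

def mlp(x, weights):
    """Dense half of the model: a small ReLU MLP."""
    for w in weights:
        x = np.maximum(x @ w, 0.0)
    return x

# One inference request: a few sparse ids per table, pooled, concatenated,
# then pushed through the MLP to produce a ranking score.
sparse_ids = [rng.integers(0, ROWS_PER_TABLE, size=4) for _ in range(NUM_TABLES)]
pooled = np.concatenate([embedding_lookup(t, ids)
                         for t, ids in zip(tables, sparse_ids)])

weights = [rng.standard_normal((NUM_TABLES * EMB_DIM, 64)),
           rng.standard_normal((64, 1))]
score = mlp(pooled, weights)
print(score.shape)  # (1,)
```

An "embedding-dominated" model spends most of its time in `embedding_lookup` (random reads over huge tables, a natural fit for SSD offload), while an "MLP-dominated" model spends it in `mlp`; the abstract's point is that the SSD-side FPGA accelerates both.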
Citation Format(s)
RM-SSD: In-Storage Computing for Large-Scale Recommendation Inference. / Sun, Xuan; Wan, Hu; Li, Qiao et al.
Proceedings - 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA 2022). Institute of Electrical and Electronics Engineers, Inc., 2022. p. 1056-1070 (Proceedings - International Symposium on High-Performance Computer Architecture; Vol. 2022-April).