TY - GEN
T1 - Overcoming Memory Constraint for Improved Target Classification Performance on Embedded Deep Learning Systems
AU - Wu, Fan
AU - Liu, Huanghe
AU - Zhu, Zongwei
AU - Ji, Cheng
AU - Xue, Chun Jason
N1 - Full text of this publication does not contain sufficient affiliation information. With consent from the author(s) concerned, the Research Unit(s) information for this record is based on the existing academic department affiliation of the author(s).
PY - 2020/12
Y1 - 2020/12
N2 - Pattern recognition applications such as face recognition, detection of broken eggs, and classification of agricultural products all use deep neural network image classification to improve quality of service. However, traditional cloud inference models suffer from several problems, such as network delay fluctuations and privacy leakage. As a result, most real-time applications currently need to be deployed on edge computing devices. Constrained by the computing power and memory limitations of edge devices, an efficient memory manager for model reasoning is the key to improving quality of service. This study first explores an incremental loading strategy for model weights during model reasoning. Next, runtime memory space is optimized through data layout reorganization in the spatial dimension. Notably, the proposed schemes are orthogonal and transparent to the model. Experimental results demonstrate that the proposed approach reduces memory consumption by 43.74% on average without additional reasoning-time overhead.
KW - Deep learning reasoning
KW - Edge computing
KW - Memory management
UR - http://www.scopus.com/inward/record.url?scp=85105342760&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-85105342760&origin=recordpage
U2 - 10.1109/HPCC-SmartCity-DSS50907.2020.00081
DO - 10.1109/HPCC-SmartCity-DSS50907.2020.00081
M3 - RGC 32 - Refereed conference paper (with host publication)
SN - 9781728176499
T3 - Proceedings - IEEE International Conference on High Performance Computing and Communications, IEEE International Conference on Smart City and IEEE International Conference on Data Science and Systems, HPCC-SmartCity-DSS
SP - 634
EP - 639
BT - Proceedings - 2020 IEEE 22nd International Conference on High Performance Computing and Communications; IEEE 18th International Conference on Smart City; IEEE 6th International Conference on Data Science and Systems, HPCC-SmartCity-DSS 2020
PB - IEEE
T2 - 22nd IEEE International Conference on High Performance Computing and Communications, 18th IEEE International Conference on Smart City and 6th IEEE International Conference on Data Science and Systems (HPCC-SmartCity-DSS 2020)
Y2 - 14 December 2020 through 16 December 2020
ER -