Abstract
In recent years, Deep Neural Networks (DNNs) have been increasingly adopted by a wide range of time-critical applications running on edge platforms with heterogeneous multiprocessors. To meet the stringent timing requirements of these applications, heterogeneous CPU and GPU resources must be efficiently utilized for the inference of multiple DNN models. Such a cross-processor real-time DNN inference paradigm poses major challenges due to the inherent performance imbalance among different processors and the lack of real-time support for cross-processor inference in existing deep learning frameworks. In this work, we propose a new system named BlastNet that exploits the duo-block, a new model inference abstraction, to support highly efficient cross-processor real-time DNN inference. Each duo-block has a dual model structure, enabling efficient fine-grained inference that alternates across different processors. BlastNet employs a novel block-level Neural Architecture Search (NAS) technique to generate duo-blocks, which accounts for computing characteristics and communication overhead. The duo-blocks are optimized at design time and then dynamically scheduled at runtime to achieve high utilization of heterogeneous CPU and GPU resources. BlastNet is implemented on an indoor autonomous driving platform and three popular edge platforms. Extensive results show that BlastNet reduces the deadline miss rate by 35.07% with a mere 1.63% loss in model accuracy. © 2022 ACM.
| Original language | English |
|---|---|
| Title of host publication | SenSys 2022 - Proceedings of the 20th ACM Conference on Embedded Networked Sensor Systems |
| Publisher | Association for Computing Machinery |
| Pages | 91-105 |
| ISBN (Print) | 9781450398862 |
| Publication status | Published - Nov 2022 |
| Event | 20th ACM Conference on Embedded Networked Sensor Systems (SenSys 2022), Hynes Convention Center, Boston, United States. Duration: 6 Nov 2022 → 9 Nov 2022. https://sensys.acm.org/2022/ |
Publication series
| Name | SenSys - Proceedings of the ACM Conference on Embedded Networked Sensor Systems |
|---|---|
Conference
| Conference | 20th ACM Conference on Embedded Networked Sensor Systems (SenSys 2022) |
|---|---|
| Place | United States |
| City | Boston |
| Period | 6 Nov 2022 → 9 Nov 2022 |
| Internet address | https://sensys.acm.org/2022/ |
Bibliographical note
Full text of this publication does not contain sufficient affiliation information. With consent from the author(s) concerned, the Research Unit(s) information for this record is based on the existing academic department affiliation of the author(s).
Funding
The work described in this article was partially supported by the Research Grants Council (RGC)-General Research Fund under Grant No. 14209619 and Grant No. 14203420.
Research Keywords
- CPU-GPU heterogeneous platform
- edge artificial intelligence
- multi-DNN concurrent execution
- neural architecture search
- on-device deep learning
- real-time scheduling
RGC Funding Information
- RGC-funded