BlastNet: Exploiting Duo-Blocks for Cross-Processor Real-Time DNN Inference

Neiwen Ling, Xuan Huang, Zhihe Zhao, Nan Guan, Zhenyu Yan, Guoliang Xing*

*Corresponding author for this work

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

37 Citations (Scopus)

Abstract

In recent years, Deep Neural Networks (DNNs) have been increasingly adopted by a wide range of time-critical applications running on edge platforms with heterogeneous multiprocessors. To meet the stringent timing requirements of these applications, heterogeneous CPU and GPU resources must be efficiently utilized for the inference of multiple DNN models. Such a cross-processor real-time DNN inference paradigm poses major challenges due to the inherent performance imbalance among different processors and the lack of real-time support for cross-processor inference in existing deep learning frameworks. In this work, we propose a new system named BlastNet that exploits the duo-block, a new model inference abstraction, to support highly efficient cross-processor real-time DNN inference. Each duo-block has a dual model structure, enabling efficient fine-grained inference that alternates across different processors. BlastNet employs a novel block-level Neural Architecture Search (NAS) technique to generate duo-blocks, which accounts for computing characteristics and communication overhead. The duo-blocks are optimized at design time and then dynamically scheduled at runtime to achieve high utilization of heterogeneous CPU and GPU resources. BlastNet is implemented on an indoor autonomous driving platform and three popular edge platforms. Extensive results show that BlastNet achieves a 35.07% lower deadline miss rate with a mere 1.63% loss in model accuracy. © 2022 ACM.
Original language: English
Title of host publication: SenSys 2022 - Proceedings of the 20th ACM Conference on Embedded Networked Sensor Systems
Publisher: Association for Computing Machinery
Pages: 91-105
ISBN (Print): 9781450398862
Publication status: Published - Nov 2022
Event: 20th ACM Conference on Embedded Networked Sensor Systems (SenSys 2022) - Hynes Convention Center, Boston, United States
Duration: 6 Nov 2022 - 9 Nov 2022
https://sensys.acm.org/2022/

Publication series

Name: SenSys - Proceedings of the ACM Conference on Embedded Networked Sensor Systems

Conference

Conference: 20th ACM Conference on Embedded Networked Sensor Systems (SenSys 2022)
Place: United States
City: Boston
Period: 6/11/22 - 9/11/22

Bibliographical note

Full text of this publication does not contain sufficient affiliation information. With consent from the author(s) concerned, the Research Unit(s) information for this record is based on the existing academic department affiliation of the author(s).

Funding

The work described in this article was partially supported by the Research Grants Council (RGC)-General Research Fund under Grant No. 14209619 and Grant No. 14203420.

Research Keywords

  • CPU-GPU heterogeneous platform
  • edge artificial intelligence
  • multi-DNN concurrent execution
  • neural architecture search
  • on-device deep learning
  • real-time scheduling

RGC Funding Information

  • RGC-funded
