Abstract
There is a growing demand to deploy computation-intensive deep learning (DL) models on resource-constrained mobile devices for real-time intelligent applications. Equipped with a variety of processing units such as CPUs, GPUs, and NPUs, mobile devices hold the potential to accelerate DL inference via parallel execution across heterogeneous processors. Various efficient parallel methods have been explored to optimize computation distribution, achieve load balance, and minimize communication costs across processors, yet their practical effectiveness in dynamic and diverse real-world mobile environments remains underexplored. This paper presents a holistic empirical study assessing the capabilities and challenges of parallel DL inference on heterogeneous mobile processors. Through carefully designed experiments covering various DL models, mobile software/hardware environments, workload patterns, and levels of resource availability, we identify the limitations of existing techniques and highlight opportunities for cross-level optimization. © 2024 Copyright held by the owner/author(s). Publication rights licensed to ACM.
| Original language | English |
|---|---|
| Title of host publication | AdaAIoTSys '24: Proceedings of the 2024 Workshop on Adaptive AIoT Systems |
| Publisher | Association for Computing Machinery |
| Pages | 1-6 |
| ISBN (Print) | 9798400706646 |
| DOIs | |
| Publication status | Published - Jun 2024 |
| Event | 2024 Workshop on Adaptive AIoT Systems (AdaAIoTSys 2024) Co-located with 22nd ACM International Conference on Mobile Systems, Applications, and Services (MobiSys 2024) - Toranomon Hills Forum, Minato-ku, Japan Duration: 3 Jun 2024 → 7 Jun 2024 https://www.sigmobile.org/mobisys/2024/wsl.html |
Publication series
| Name | AdaAIoTSys - Proceedings of the Workshop on Adaptive AIoT Systems |
|---|---|
Conference
| Conference | 2024 Workshop on Adaptive AIoT Systems (AdaAIoTSys 2024) Co-located with 22nd ACM International Conference on Mobile Systems, Applications, and Services (MobiSys 2024) |
|---|---|
| Country/Territory | Japan |
| City | Minato-ku |
| Period | 3/06/24 → 7/06/24 |
| Internet address | https://www.sigmobile.org/mobisys/2024/wsl.html |
Bibliographical note
Full text of this publication does not contain sufficient affiliation information. With consent from the author(s) concerned, the Research Unit(s) information for this record is based on the existing academic department affiliation of the author(s).
Funding
This work was supported by the National Science Fund for Distinguished Young Scholars (62025205), the National Natural Science Foundation of China (No. 62032020, 62102317), and CityU APRC grant (No. 9610633).
Research Keywords
- Heterogeneous processors
- Parallel DL inference