On-demand Edge Inference Scheduling with Accuracy and Deadline Guarantee

Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 32 - Refereed conference paper (with host publication) › peer-review

1 Scopus Citations


Detail(s)

Original language: English
Title of host publication: 2023 IEEE/ACM 31st International Symposium on Quality of Service (IWQoS)
Publisher: IEEE
Number of pages: 10
ISBN (Electronic): 9798350399738
ISBN (Print): 979-8-3503-9974-5
Publication status: Published - 2023

Publication series

Name: IEEE International Workshop on Quality of Service, IWQoS
ISSN (Print): 1548-615X
ISSN (Electronic): 2766-8568

Conference

Title: 31st IEEE/ACM International Symposium on Quality of Service (IWQoS 2023)
Location: Orlando World Center Marriott
Place: United States
City: Orlando
Period: 19 - 21 June 2023

Abstract

To meet the increasing demand for machine-learning-based applications, inference services are increasingly being pushed to the network edge. This work designs an on-demand edge inference scheduler with accuracy and deadline guarantees for repetitive tasks. Specifically, we consider an edge server preinstalled with multiple early-exit Deep Neural Networks (DNNs), where each DNN-exit pair provides inference service of a different quality. We also account for the diversity of tasks' quality-of-service requirements and the associated utility. We aim to maximize the system's total utility by optimizing service assignment and time scheduling subject to resource, accuracy, and deadline constraints. We present an integer linear programming formulation of this problem and show that it is NP-hard even in the offline case. The problem is challenging because service assignment and time scheduling are coupled. To derive low-complexity scheduling solutions, we introduce a task-service graph and convert the problem into a service assignment selection problem with schedulability constraints. We then design a polynomial-complexity algorithm with a ρδ-approximation ratio for the offline problem, where ρ is the task-wise utility ratio and δ is the maximum number of concurrent tasks. To handle the online problem, we propose an online heuristic algorithm. Simulation results show that the proposed algorithms outperform state-of-the-art baseline algorithms. © 2023 IEEE.
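The coupling between service assignment and time scheduling described in the abstract can be illustrated with a small brute-force sketch. This is not the paper's algorithm or its exact model: it assumes a single server that runs one inference at a time without preemption, a fixed utility per accepted task, and no memory or other resource constraint. The names `Service`, `Task`, `schedulable`, and `best_assignment` are hypothetical and introduced only for illustration. The search is exponential in the number of tasks, so it is only usable on toy instances.

```python
from dataclasses import dataclass
from itertools import permutations, product


@dataclass(frozen=True)
class Service:
    """One DNN-exit pair (illustrative model): its accuracy and latency."""
    accuracy: float
    latency: float


@dataclass(frozen=True)
class Task:
    """An inference request with a time window, accuracy floor, and utility."""
    release: float
    deadline: float
    min_accuracy: float
    utility: float


def schedulable(jobs):
    """Return True if some non-preemptive order on one server meets all deadlines.

    jobs is a list of (release, deadline, latency) tuples. For a fixed order,
    starting every job as early as possible is optimal, so we only try orders.
    """
    for order in permutations(jobs):
        t = 0.0
        feasible = True
        for release, deadline, latency in order:
            t = max(t, release) + latency  # finish time of this job
            if t > deadline:
                feasible = False
                break
        if feasible:
            return True
    return False


def best_assignment(tasks, services):
    """Exhaustively pick, per task, a service index or None (reject)."""
    best_utility = 0.0
    best_choice = tuple(None for _ in tasks)
    options = [None, *range(len(services))]
    for choice in product(options, repeat=len(tasks)):
        jobs, utility, valid = [], 0.0, True
        for task, s in zip(tasks, choice):
            if s is None:
                continue  # task rejected, contributes no utility
            svc = services[s]
            if svc.accuracy < task.min_accuracy:
                valid = False  # accuracy constraint violated
                break
            jobs.append((task.release, task.deadline, svc.latency))
            utility += task.utility
        # Assignment alone is not enough: the chosen latencies must also fit in time.
        if valid and utility > best_utility and schedulable(jobs):
            best_utility, best_choice = utility, choice
    return best_utility, best_choice


# Toy instance (made-up numbers): a fast low-accuracy exit and a slow accurate one.
services = [Service(accuracy=0.70, latency=1.0), Service(accuracy=0.90, latency=3.0)]
tasks = [
    Task(release=0.0, deadline=3.0, min_accuracy=0.85, utility=5.0),
    Task(release=0.0, deadline=4.0, min_accuracy=0.60, utility=2.0),
    Task(release=1.0, deadline=4.5, min_accuracy=0.60, utility=3.0),
]
best_utility, best_choice = best_assignment(tasks, services)
print(best_utility, best_choice)
```

In this toy instance all three tasks have an accuracy-feasible service, yet they cannot all meet their deadlines on one server, so the search rejects the lowest-utility task. This is the kind of interaction between assignment and schedulability that makes the problem hard and motivates an approximation algorithm over exhaustive search.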

Citation Format(s)

On-demand Edge Inference Scheduling with Accuracy and Deadline Guarantee. / She, Yechao; Li, Minming; Jin, Yang et al.
2023 IEEE/ACM 31st International Symposium on Quality of Service (IWQoS). IEEE, 2023. (IEEE International Workshop on Quality of Service, IWQoS).
