On-demand deep model compression for mobile devices: A usage-driven model selection framework

  • Sicong Liu
  • Kaiming Nan
  • Yingyan Lin
  • Hui Liu
  • Zimu Zhou
  • Junzhao Du*

*Corresponding author for this work

Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 32 - Refereed conference paper (with host publication) › peer-review

Abstract

Recent research has demonstrated the potential of deploying deep neural networks (DNNs) on resource-constrained mobile platforms by trimming down the network complexity using different compression techniques. Current practice investigates only stand-alone compression schemes, even though each compression technique may be well suited to only certain types of DNN layers. Also, these compression techniques are optimized merely for the inference accuracy of DNNs, without explicitly considering other application-driven system performance metrics (e.g., latency and energy cost) or the varying resource availability across platforms (e.g., storage and processing capability). In this paper, we explore the desirable tradeoff between performance and resource constraints according to user-specified needs, from a holistic system-level viewpoint. Specifically, we develop a usage-driven selection framework, referred to as AdaDeep, to automatically select a combination of compression techniques for a given DNN that leads to an optimal balance between user-specified performance goals and resource constraints. In an extensive evaluation on five public datasets and across twelve mobile devices, experimental results show that AdaDeep enables up to 9.8× latency reduction, 4.3× energy efficiency improvement, and 38× storage reduction in DNNs while incurring negligible accuracy loss. AdaDeep also uncovers multiple effective combinations of compression techniques unexplored in existing literature. © 2018 Association for Computing Machinery.
Original language: English
Title of host publication: MobiSys 2018 - Proceedings of the 16th ACM International Conference on Mobile Systems, Applications, and Services
Publisher: Association for Computing Machinery
Pages: 389-400
ISBN (Print): 9781450357203
DOIs
Publication status: Published - 10 Jun 2018
Externally published: Yes
Event: 16th ACM International Conference on Mobile Systems, Applications, and Services, MobiSys 2018 - Munich, Germany
Duration: 10 Jun 2018 - 15 Jun 2018

Publication series

Name: MobiSys 2018 - Proceedings of the 16th ACM International Conference on Mobile Systems, Applications, and Services

Conference

Conference: 16th ACM International Conference on Mobile Systems, Applications, and Services, MobiSys 2018
Place: Germany
City: Munich
Period: 10/06/18 - 15/06/18

Bibliographical note

Publication details (e.g. title, author(s), publication statuses and dates) are captured on an “AS IS” and “AS AVAILABLE” basis at the time of record harvesting from the data source. Suggestions for further amendments or supplementary information can be sent to [email protected].

Funding

This work is supported in part by the National Key Research & Development Program of China #2018YFB1003605, the Natural Science Foundation of China (NSFC) #61472312, and National Science Foundation (NSF) Award #1611295.

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 7 - Affordable and Clean Energy

Research Keywords

  • Deep learning
  • Deep reinforcement learning
  • Model compression
