Resource Optimization for Emerging Multimedia Systems
新興多媒體系統的資源優化
Student thesis: Doctoral Thesis
Author(s)
Related Research Unit(s)
Detail(s)
Awarding Institution | |
---|---|
Supervisors/Advisors | |
Award date | 28 Nov 2023 |
Link(s)
Permanent Link | https://scholars.cityu.edu.hk/en/theses/theses(5ec991ee-2f65-4850-9888-82b8dbe39657).html |
---|---|
Abstract
Multimedia systems have become an integral part of daily life and play a significant role in applications that shape everyday experience. These applications include video on demand, live video streaming, and video conferencing. Alongside these traditional applications, emerging technologies such as augmented and virtual reality (AR/VR), short video streaming, and video analytics continue to surge, propelled by the rapid development of deep neural networks (DNNs) and the growing demands of real-world scenarios. As a result, multimedia is undergoing a transformative shift in both user interaction and system development. Techniques developed for traditional multimedia systems are insufficient for these next-generation technologies, which call for new ways to optimize and innovate system designs.
The main challenge in building efficient, high-quality emerging multimedia systems is the conflict between application demands and resource-constrained environments. In this thesis, we focus on three emerging multimedia applications: short video streaming, video analytics, and point cloud analytics. We tackle this challenge by introducing three novel systems:
Alfie, a short video streaming system designed to provide bandwidth-efficient prefetching and high quality of experience (QoE). We comprehensively investigate the significant bandwidth overhead of the existing prefetching scheme and propose a novel adaptive prefetching algorithm based on deep reinforcement learning (DRL). Alfie adapts to dynamic user behaviors and network conditions by learning from past experience, arriving at a policy that minimizes bandwidth consumption while also improving QoE.
Polly, a cross-camera video analytics system that enables inference sharing on constrained edge devices. Polly explores the overlapping fields of view (FoVs) of co-located cameras as a new dimension for optimizing edge-based video analytics. It saves resources and accelerates inference by directly mapping detected objects from the reference camera to the target camera instead of redundantly running the DNN model on the overlapping FoVs.
Moby, a novel point cloud analytics system that enables real-time 3D object detection on edge devices. Rather than relying on heavy DNN-based 3D detectors, Moby uses a lightweight 2D-to-3D transformation method that can easily run on the edge device. Moby also addresses the limitations of edge-only and cloud-only inference: it maintains high accuracy by scheduling only a subset of frames to be offloaded to the server for inference.
In summary, this thesis introduces a collection of resource optimization approaches designed specifically for emerging multimedia systems: Alfie, which focuses on network optimization, and Polly and Moby, which concentrate on computation optimization. Through complete implementations and comprehensive evaluations, these approaches are shown to substantially improve resource utilization.
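The cross-camera mapping idea behind Polly can be illustrated with a small sketch. The abstract does not say how detections are mapped between cameras, so this sketch assumes a fixed planar homography between the two views, which is a common technique but only exact for points lying on a shared plane such as the ground. The function names `apply_homography` and `map_box` and the matrix `H` are hypothetical and are not taken from the thesis.

```python
def apply_homography(H, x, y):
    """Project point (x, y) through a 3x3 homography H given as nested lists."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return u, v


def map_box(H, box):
    """Map an axis-aligned box (x1, y1, x2, y2) from the reference view to the
    target view and return the axis-aligned box enclosing the mapped corners."""
    x1, y1, x2, y2 = box
    corners = [(x1, y1), (x2, y1), (x2, y2), (x1, y2)]
    mapped = [apply_homography(H, x, y) for x, y in corners]
    xs = [p[0] for p in mapped]
    ys = [p[1] for p in mapped]
    return (min(xs), min(ys), max(xs), max(ys))


# Assumed homography for illustration only: a pure translation of (+50, -20).
H = [[1.0, 0.0, 50.0],
     [0.0, 1.0, -20.0],
     [0.0, 0.0, 1.0]]

# A detection produced on the reference camera is reused on the target camera
# without running the detector there.
print(map_box(H, (100, 200, 180, 320)))  # (150.0, 180.0, 230.0, 300.0)
```

In a real deployment the homography would have to be estimated from matched points in the overlapping region, and a mapped box is only an approximation for objects that rise above the assumed plane.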