Reinforcement learning-based QoE-oriented dynamic adaptive streaming framework

Research output: Journal Publications and Reviews (RGC: 21, 22, 62) > 21_Publication in refereed journal > peer-review

1 Scopus Citations



Original language: English
Pages (from-to): 786-803
Journal / Publication: Information Sciences
Online published: 13 May 2021
Publication status: Published - Aug 2021


Dynamic adaptive streaming over HTTP (DASH) has been widely adopted by content providers for online video transmission and has greatly improved streaming performance. Designing an efficient DASH system is challenging because both encoded video sequences and network traces fluctuate widely. In this paper, a reinforcement learning (RL)-based DASH technique that addresses user quality of experience (QoE) is constructed. The DASH adaptive bitrate (ABR) selection problem is formulated as a Markov decision process (MDP). Accordingly, an RL-based solution is proposed to solve the MDP, in which the DASH client acts as the RL agent and network variation constitutes the environment. The proposed user QoE, which jointly considers video quality and buffer status, is used as the reward. The goal of the RL algorithm is to select a suitable video quality level for each video segment so as to maximize the total reward. The proposed RL-based ABR algorithm is then embedded in the QoE-oriented DASH framework. Experimental results show that the proposed RL-based ABR algorithm outperforms state-of-the-art schemes in terms of both temporal and visual QoE factors by a noticeable margin while guaranteeing application-level fairness when multiple clients share a bottlenecked network.
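The MDP formulation described in the abstract can be illustrated with a minimal tabular Q-learning sketch. This is not the authors' algorithm or QoE model: the bitrate ladder, segment duration, buffer cap, throughput random walk, state discretization, and reward weights below are all hypothetical values chosen only to show how a client (agent) might pick a quality level per segment against a varying network (environment), with a reward that combines video quality and a buffer-derived stall penalty.

```python
import random

# All constants below are illustrative assumptions, not values from the paper.
BITRATES_KBPS = [300, 750, 1200, 2850]  # hypothetical quality levels
SEGMENT_SEC = 4.0                       # hypothetical segment duration
MAX_BUFFER_SEC = 20.0                   # hypothetical client buffer cap


def step(buffer_sec, level, throughput_kbps):
    """Download one segment; return (new_buffer_sec, rebuffer_sec)."""
    download_sec = BITRATES_KBPS[level] * SEGMENT_SEC / throughput_kbps
    # Playback stalls for whatever download time the buffer cannot cover.
    rebuffer_sec = max(0.0, download_sec - buffer_sec)
    new_buffer = max(0.0, buffer_sec - download_sec) + SEGMENT_SEC
    return min(new_buffer, MAX_BUFFER_SEC), rebuffer_sec


def reward(level, rebuffer_sec):
    """Toy QoE: bitrate in Mbps minus an assumed penalty per stalled second."""
    return BITRATES_KBPS[level] / 1000.0 - 4.0 * rebuffer_sec


def state_of(buffer_sec, last_throughput_kbps):
    """Discretize the buffer (4 s bins) and last throughput (1000 kbps bins)."""
    return (int(buffer_sec // 4), int(last_throughput_kbps // 1000))


def train(episodes=2000, segments=50, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Epsilon-greedy tabular Q-learning over a synthetic throughput trace."""
    rng = random.Random(seed)
    n = len(BITRATES_KBPS)
    q = {}
    for _ in range(episodes):
        buffer_sec, throughput = 0.0, 1500.0
        for _ in range(segments):
            s = state_of(buffer_sec, throughput)
            q.setdefault(s, [0.0] * n)
            if rng.random() < epsilon:
                a = rng.randrange(n)                      # explore
            else:
                a = max(range(n), key=lambda i: q[s][i])  # exploit
            # The network varies as a clipped random walk (assumed model).
            throughput = min(3000.0, max(500.0, throughput + rng.uniform(-400.0, 400.0)))
            buffer_sec, stall = step(buffer_sec, a, throughput)
            r = reward(a, stall)
            s2 = state_of(buffer_sec, throughput)
            q.setdefault(s2, [0.0] * n)
            # Standard Q-learning update toward the bootstrapped target.
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
    return q


if __name__ == "__main__":
    table = train()
    for state in sorted(table):
        best = max(range(len(BITRATES_KBPS)), key=lambda i: table[state][i])
        print(state, "->", BITRATES_KBPS[best], "kbps")
```

Running the script prints the greedy bitrate choice for each visited (buffer bin, throughput bin) state. A practical system would likely replace the lookup table with a function approximator and the synthetic random walk with recorded network traces.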

Research Area(s)

  • Machine learning, MPEG-DASH, Quality of experience, Reinforcement learning