Spatiotemporal Feature Hierarchy-Based Blind Prediction of Natural Video Quality via Transfer Learning

Research output: Journal Publications and Reviews, RGC 21 - Publication in refereed journal, peer-review

6 Scopus Citations

Author(s)

  • Weizhi Xian
  • Mingliang Zhou
  • Bin Fang
  • Cheng Ji
  • Tao Xiang
  • Weijia Jia

Detail(s)

Original language: English
Pages (from-to): 130-143
Number of pages: 14
Journal / Publication: IEEE Transactions on Broadcasting
Volume: 69
Issue number: 1
Online published: 3 Aug 2022
Publication status: Published - Mar 2023

Abstract

In this paper, we propose a pyramidal spatiotemporal feature hierarchy (PSFH)-based no-reference (NR) video quality assessment (VQA) method using transfer learning. First, we generate simulated videos by a generative adversarial network (GAN)-based image restoration model. The residual maps between the distorted frames and simulated frames, which can capture rich information, are utilized as one input of the quality regression network. Second, we use 3D convolution operations to construct a PSFH network with five stages. The spatiotemporal features incorporating the shared features transferred from the pretrained image restoration model are fused stage by stage. Third, with the guidance of the transferred knowledge, each stage generates multiple feature mapping layers that encode different semantics and degradation information using 3D convolution layers and gated recurrent units (GRUs). Finally, five approximate perceptual quality scores and a precise prediction score are obtained by fully connected (FC) networks. The whole model is trained under a finely designed loss function that combines pseudo-Huber loss and Pearson linear correlation coefficient (PLCC) loss to improve the robustness and prediction accuracy. According to the extensive experiments, outstanding results can be obtained compared with other state-of-the-art methods. Both the source code and models are available online.
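
The combined training objective mentioned in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: the function names, the `delta` and `weight` parameters, the use of `1 - PLCC` as the correlation term, and the simple weighted sum. It is not the authors' implementation.

```python
import numpy as np


def pseudo_huber(pred, target, delta=1.0):
    """Mean pseudo-Huber loss: quadratic near zero, roughly linear for large errors."""
    r = pred - target
    return np.mean(delta ** 2 * (np.sqrt(1.0 + (r / delta) ** 2) - 1.0))


def plcc_loss(pred, target, eps=1e-8):
    """1 - Pearson linear correlation coefficient over a batch of scores."""
    p = pred - pred.mean()
    t = target - target.mean()
    plcc = np.sum(p * t) / (np.sqrt(np.sum(p ** 2) * np.sum(t ** 2)) + eps)
    return 1.0 - plcc


def combined_loss(pred, target, delta=1.0, weight=1.0):
    """Hypothetical weighted sum of the two terms (weighting is an assumption)."""
    return pseudo_huber(pred, target, delta) + weight * plcc_loss(pred, target)


if __name__ == "__main__":
    # Made-up predicted and subjective quality scores for a batch of four videos.
    predicted = np.array([3.1, 4.2, 2.0, 4.8])
    subjective = np.array([3.0, 4.5, 2.2, 4.6])
    print(combined_loss(predicted, subjective))
```

In this sketch the pseudo-Huber term penalizes per-video score errors while limiting the influence of outliers, and the PLCC term rewards predictions that vary linearly with the subjective scores across the batch, which matches the robustness and accuracy goals the abstract describes.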

Research Area(s)

  • 3D convolution, Distortion, Feature extraction, generative adversarial network, Image restoration, pyramidal spatiotemporal feature, Quality assessment, Spatiotemporal phenomena, Three-dimensional displays, transfer learning, Video quality assessment

Citation Format(s)

Spatiotemporal Feature Hierarchy-Based Blind Prediction of Natural Video Quality via Transfer Learning. / Xian, Weizhi; Zhou, Mingliang; Fang, Bin et al.
In: IEEE Transactions on Broadcasting, Vol. 69, No. 1, 03.2023, p. 130-143.
