A No-Reference Quality Assessment Model for Screen Content Videos via Hierarchical Spatiotemporal Perception

Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review


Author(s)

  • Zhihong Liu
  • Huanqiang Zeng
  • Jing Chen
  • Rui Ding
  • Yifan Shi

Detail(s)

Original language: English
Article number: 10697204
Number of pages: 15
Journal / Publication: IEEE Transactions on Circuits and Systems for Video Technology
Publication status: Online published - 27 Sept 2024

Abstract

In this paper, a novel deep learning-based no-reference video quality assessment (NR-VQA) model for screen content videos (SCVs), called the hierarchical spatiotemporal perceptual quality model (HSPQ), is proposed. The model is motivated by two observations: first, the human visual system (HVS) perceives SCVs hierarchically, with varying sensitivity and attention across regions of different attributes; second, visual redundancies are abundant in the spatiotemporal domain of SCVs and degrade perceived video quality to some extent. Based on these characteristics, each SCV is decomposed into three hierarchical levels (i.e., patch level, frame level, and video level), each carrying quality-related spatiotemporal information. Specifically, visual saliency is first utilized to select the most salient textual and pictorial patches; then, a dual-channel convolutional neural network integrating a spatial-gate feature enhancement module (SGFEM) is designed to evaluate patch quality separately according to patch attributes at the patch level. Exploiting spatial correlation, an adaptive blur-focused visual mechanism-based weighting strategy (BFWS) is proposed to convert quality scores from the patch level to the frame level. Finally, the video-level quality score, which reflects temporal perceptual quality degradation, is combined with the frame-level scores to provide a comprehensive evaluation of distorted SCV quality. Experiments conducted on the Screen Content Video Database (SCVD) and the Compressed Screen Content Video Quality (CSCVQ) database demonstrate that the proposed HSPQ model aligns better with the HVS's visual perception of SCVs. Moreover, it exhibits strong robustness compared to multiple classic and state-of-the-art image/video quality assessment models.
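The hierarchical aggregation described in the abstract (patch-level scores weighted up to a frame-level score, then combined with a video-level temporal term) can be sketched as follows. This is an illustrative numerical outline only: the weighting functions, the mixing weight, and the temporal term here are placeholder assumptions, not the paper's actual BFWS weights or fusion scheme.

```python
import numpy as np

def frame_score(patch_scores, patch_weights):
    """Aggregate patch-level quality scores into a frame-level score
    via normalized per-patch weights. The weights stand in for the
    paper's blur-focused weighting strategy (BFWS); here they are
    assumed to be given, e.g. derived from saliency/blur maps."""
    w = np.asarray(patch_weights, dtype=float)
    w = w / w.sum()  # normalize so the weights sum to 1
    return float(np.dot(w, np.asarray(patch_scores, dtype=float)))

def video_score(frame_scores, temporal_score, temporal_weight=0.5):
    """Combine the mean frame-level quality with a video-level
    temporal quality term. The 0.5 mixing weight and the linear
    fusion are illustrative assumptions, not values from the paper."""
    spatial = float(np.mean(frame_scores))
    return (1.0 - temporal_weight) * spatial + temporal_weight * temporal_score

# Hypothetical usage: two frames, each scored from two patches.
f1 = frame_score([3.0, 4.0], [1.0, 3.0])   # weighted toward the second patch
f2 = frame_score([2.0, 2.0], [1.0, 1.0])
q  = video_score([f1, f2], temporal_score=4.0)
```

The key design point the sketch illustrates is that quality flows bottom-up: patch scores are never averaged uniformly but weighted by perceptual importance before any frame- or video-level fusion takes place.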

© 2024 IEEE.

Research Area(s)

  • No-reference, video quality assessment, screen content video, visual saliency, hierarchical perception

Bibliographic Note

Research Unit(s) information for this publication is provided by the author(s) concerned.
