Enhanced Context Mining and Filtering for Learned Video Compression

Haifeng Guo, Sam Kwong*, Dongjie Ye, Shiqi Wang

*Corresponding author for this work

Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review

12 Citations (Scopus)

Abstract

The Deep Contextual Video Compression (DCVC) framework uses a conditional coding paradigm in which the context is extracted and employed as a condition for the contextual encoder-decoder and the entropy model. In this paper, we propose enhanced context mining and filtering to improve the compression efficiency of DCVC. First, because the context in DCVC is generated without supervision and redundancy may exist among context channels, we propose an enhanced context mining model that mitigates this cross-channel redundancy to obtain better context features. Second, we introduce a transformer-based enhancement network as a filtering module to capture long-distance dependencies and further improve compression efficiency. This network adopts a full-resolution pipeline and computes self-attention across the channel dimension. By combining the local modeling ability of the enhanced context mining model with the non-local modeling ability of the transformer-based enhancement network, our model outperforms the low-delay P (LDP) configuration of Versatile Video Coding (VVC), achieving an average bit saving of 6.7% in terms of MS-SSIM. © 2023 IEEE.
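
As a rough illustration of what self-attention across the channel dimension means, the sketch below computes a C×C attention map over a C×H×W feature tensor in NumPy. It is not the network from the paper: the function name `channel_self_attention`, the plain matrix projections `w_q`, `w_k`, `w_v`, the `sqrt(H*W)` scaling, and the tensor sizes are all assumptions chosen for the example.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def channel_self_attention(feat, w_q, w_k, w_v):
    """Self-attention where the tokens are channels, not pixels.

    feat:          (C, H, W) feature map
    w_q, w_k, w_v: (C, C) projection matrices (hypothetical stand-ins
                   for the 1x1 convolutions a real network might use)
    Returns the attended features (C, H, W) and the attention map (C, C).
    """
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)          # each channel becomes one token of length H*W

    q = w_q @ x                         # (C, H*W)
    k = w_k @ x                         # (C, H*W)
    v = w_v @ x                         # (C, H*W)

    # Channel-to-channel similarity; the scaling factor is an assumption.
    attn = softmax(q @ k.T / np.sqrt(h * w), axis=-1)   # (C, C)

    out = attn @ v                      # mix channels according to the attention map
    return out.reshape(c, h, w), attn


rng = np.random.default_rng(0)
C, H, W = 8, 16, 16                     # illustrative sizes only
feat = rng.standard_normal((C, H, W))
w_q, w_k, w_v = (rng.standard_normal((C, C)) / np.sqrt(C) for _ in range(3))

out, attn = channel_self_attention(feat, w_q, w_k, w_v)
print(out.shape, attn.shape)
```

Because the attention map here is C×C rather than (H·W)×(H·W), its cost grows linearly with the number of pixels, which is one plausible reason a channel-wise design pairs naturally with a full-resolution pipeline.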
Original language: English
Pages (from-to): 3814-3826
Journal: IEEE Transactions on Multimedia
Volume: 26
Online published: 18 Sept 2023
DOIs
Publication status: Published - 2024

Funding

This work was supported in part by the Key Project of Science and Technology Innovation 2030 funded by the Ministry of Science and Technology of China under Grant 2018AAA0101301, in part by the Hong Kong Innovation and Technology Commission (InnoHK Project CIMDA), and in part by the Hong Kong GRF-RGC General Research Fund under Grants 11203820, 9042598, 11209819, and CityU 9042816.

Research Keywords

  • Learned video compression
  • end-to-end training approach
  • enhanced context mining
  • in-loop filtering

RGC Funding Information

  • RGC-funded
