TY - JOUR
T1 - Joint Graph Attention and Asymmetric Convolutional Neural Network for Deep Image Compression
AU - Tang, Zhisen
AU - Wang, Hanli
AU - Yi, Xiaokai
AU - Zhang, Yun
AU - Kwong, Sam
AU - Kuo, C.-C. Jay
PY - 2023/1
Y1 - 2023/1
N2 - Recent deep image compression methods have achieved prominent progress by using nonlinear modeling and powerful representation capabilities of neural networks. However, most existing learning-based image compression approaches employ customized convolutional neural network (CNN) to utilize visual features by treating all pixels equally, neglecting the effect of local key features. Meanwhile, the convolutional filters in CNN usually express the local spatial relationship within the receptive field and seldom consider the long-range dependencies from distant locations. This results in the long-range dependencies of latent representations not being fully compressed. To address these issues, an end-to-end image compression method is proposed by integrating graph attention and asymmetric convolutional neural network (ACNN). Specifically, ACNN is used to strengthen the effect of local key features and reduce the cost of model training. Graph attention is introduced into image compression to address the bottleneck problem of CNN in modeling long-range dependencies. Meanwhile, regarding the limitation that existing attention mechanisms for image compression hardly share information, we propose a self-attention approach which allows information flow to achieve reasonable bit allocation. The proposed self-attention approach is in compliance with the perceptual characteristics of human visual system, as information can interact with each other via attention modules. Moreover, the proposed self-attention approach takes into account channel-level relationship and positional information to promote the compression effect of rich-texture regions. Experimental results demonstrate that the proposed method achieves state-of-the-art rate-distortion performances after being optimized by MS-SSIM compared to recent deep compression models on the benchmark datasets of Kodak and Tecnick. The project page with the source code can be found in https://mic.tongji.edu.cn. © 2022 IEEE.
AB - Recent deep image compression methods have achieved prominent progress by using nonlinear modeling and powerful representation capabilities of neural networks. However, most existing learning-based image compression approaches employ customized convolutional neural network (CNN) to utilize visual features by treating all pixels equally, neglecting the effect of local key features. Meanwhile, the convolutional filters in CNN usually express the local spatial relationship within the receptive field and seldom consider the long-range dependencies from distant locations. This results in the long-range dependencies of latent representations not being fully compressed. To address these issues, an end-to-end image compression method is proposed by integrating graph attention and asymmetric convolutional neural network (ACNN). Specifically, ACNN is used to strengthen the effect of local key features and reduce the cost of model training. Graph attention is introduced into image compression to address the bottleneck problem of CNN in modeling long-range dependencies. Meanwhile, regarding the limitation that existing attention mechanisms for image compression hardly share information, we propose a self-attention approach which allows information flow to achieve reasonable bit allocation. The proposed self-attention approach is in compliance with the perceptual characteristics of human visual system, as information can interact with each other via attention modules. Moreover, the proposed self-attention approach takes into account channel-level relationship and positional information to promote the compression effect of rich-texture regions. Experimental results demonstrate that the proposed method achieves state-of-the-art rate-distortion performances after being optimized by MS-SSIM compared to recent deep compression models on the benchmark datasets of Kodak and Tecnick. The project page with the source code can be found in https://mic.tongji.edu.cn. © 2022 IEEE.
KW - asymmetric convolutional neural network
KW - Convolution
KW - Convolutional neural networks
KW - graph attention network
KW - Image coding
KW - Image compression
KW - Kernel
KW - Rate-distortion
KW - self-attention
KW - Training
KW - variational autoencoder
KW - Visualization
UR - http://www.scopus.com/inward/record.url?scp=85136852756&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-85136852756&origin=recordpage
U2 - 10.1109/TCSVT.2022.3199472
DO - 10.1109/TCSVT.2022.3199472
M3 - RGC 21 - Publication in refereed journal
SN - 1051-8215
VL - 33
SP - 421
EP - 433
JO - IEEE Transactions on Circuits and Systems for Video Technology
JF - IEEE Transactions on Circuits and Systems for Video Technology
IS - 1
ER -