TY - JOUR
T1 - Attentional Feature Fusion for End-to-End Blind Image Quality Assessment
AU - Zhou, Mingliang
AU - Lang, Shujun
AU - Zhang, Taiping
AU - Liao, Xingran
AU - Shang, Zhaowei
AU - Xiang, Tao
AU - Fang, Bin
PY - 2023/3
Y1 - 2023/3
N2 - In this paper, an end-to-end blind image quality assessment (BIQA) model based on feature fusion with an attention mechanism is proposed. We extract multilayer features from the image and fuse them with an attention mechanism; the fused features are then mapped to a quality score, realizing image quality assessment without a reference image. First, because the human visual perception system processes input information hierarchically from local to global, we use three different neural networks to extract physically meaningful image features: a modified VGG19 and a modified VGG16 extract the substrate texture information and the local edge information, respectively, while a ResNet50 extracts high-level global semantic information. Second, to take full advantage of the multilevel features and avoid simple addition in hierarchical feature fusion, we adopt an attention-based feature fusion mechanism that combines the global and local contexts of the features and assigns different weights to the features being fused, so that the model can perceive richer types of distortion. Experimental results on six standard databases show that our approach yields improved performance.
AB - In this paper, an end-to-end blind image quality assessment (BIQA) model based on feature fusion with an attention mechanism is proposed. We extract multilayer features from the image and fuse them with an attention mechanism; the fused features are then mapped to a quality score, realizing image quality assessment without a reference image. First, because the human visual perception system processes input information hierarchically from local to global, we use three different neural networks to extract physically meaningful image features: a modified VGG19 and a modified VGG16 extract the substrate texture information and the local edge information, respectively, while a ResNet50 extracts high-level global semantic information. Second, to take full advantage of the multilevel features and avoid simple addition in hierarchical feature fusion, we adopt an attention-based feature fusion mechanism that combines the global and local contexts of the features and assigns different weights to the features being fused, so that the model can perceive richer types of distortion. Experimental results on six standard databases show that our approach yields improved performance.
KW - attentional feature fusion
KW - blind image quality assessment
KW - Data mining
KW - Distortion
KW - end-to-end
KW - Feature extraction
KW - Hierarchical feature extraction
KW - Image quality
KW - Neural networks
KW - Quality assessment
KW - Semantics
UR - http://www.scopus.com/inward/record.url?scp=85139451559&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-85139451559&origin=recordpage
U2 - 10.1109/TBC.2022.3204235
DO - 10.1109/TBC.2022.3204235
M3 - RGC 21 - Publication in refereed journal
SN - 0018-9316
VL - 69
SP - 144
EP - 152
JO - IEEE Transactions on Broadcasting
JF - IEEE Transactions on Broadcasting
IS - 1
ER -