Abstract
Full-reference image quality assessment (FR-IQA) models, which serve as benchmarks for enhancement tasks such as denoising, superresolution, and compression, are vital in modern image processing. This work aims to develop FR-IQA models on the basis of two essential observations of the human visual system (HVS). First, the HVS can infer the pristine form of distorted images, but many deep network-based FR-IQA models fail to guide perceptual image enhancement. Second, the HVS tolerates slight content misalignment and can discern appealing content deformations, a capability that is lacking in many current FR-IQA methods. We first propose the deep network-based Wasserstein distance (DeepWSD) measure, which compares pretrained features from the VGG16 network via the Wasserstein distance. The philosophy of DeepWSD is supported by efficient coding theory, which holds that the HVS perceives images by matching signal statistics. DeepWSD relates FR-IQA to the minimal effort required to convert the distribution of the distorted image into that of the pristine image in the pretrained VGG domain, and this minimal effort is quantified as the perceptual quality indicator. The experimental results demonstrate that the proposed DeepWSD delivers accurate predictions over a series of datasets and has the advantage of strong generalization capability.
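As a rough illustration of the DeepWSD idea, the sketch below computes the closed-form 1-D Wasserstein distance between sorted feature values and averages it over layers. This is a minimal NumPy sketch under stated assumptions: the random arrays stand in for real pretrained VGG16 activations, and the simple per-layer averaging is an illustrative choice, not the thesis's exact formulation.

```python
import numpy as np

def wasserstein_1d(p_feat, q_feat):
    """1-D Wasserstein distance between two equal-sized feature samples,
    computed in closed form as the mean absolute difference of their
    sorted values."""
    p_sorted = np.sort(p_feat.ravel())
    q_sorted = np.sort(q_feat.ravel())
    return float(np.mean(np.abs(p_sorted - q_sorted)))

def deepwsd_score(ref_feats, dist_feats):
    """Average the per-layer 1-D Wasserstein distances; a lower score
    indicates the distorted features are statistically closer to the
    reference features (illustrative aggregation, not the exact model)."""
    return float(np.mean([wasserstein_1d(r, d)
                          for r, d in zip(ref_feats, dist_feats)]))

# Stand-in "VGG features": random arrays replace pretrained activations.
rng = np.random.default_rng(0)
ref = [rng.normal(size=(64, 56, 56)) for _ in range(3)]
identical_score = deepwsd_score(ref, ref)  # identical inputs -> 0.0
noisy = [f + rng.normal(scale=0.5, size=f.shape) for f in ref]
noisy_score = deepwsd_score(ref, noisy)    # distortion -> positive distance
```

In the 1-D case, sorting both samples and taking the mean absolute difference is exactly the optimal-transport cost, which is why no transport solver is needed here.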
We then extend DeepWSD into a set of novel FR-IQA models. The proposed measures compare features from pretrained deep image classification networks via distribution distance measures: the Wasserstein distance, the Jensen–Shannon divergence, and the symmetric Kullback–Leibler divergence, leading to three distinct deep network-based FR-IQA measures. These measures offer three main advantages. First, they provide reliable perceptual quality predictions without relying on training or fine-tuning on any IQA dataset. Second, they are backbone independent and adaptable across various network architectures. Third, they can guide perceptual image enhancement effectively without introducing artifacts. We explored their application in training deep image superresolution networks and found that they can guide the network in generating images with sharp edges and delicate textures.
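The two divergence-based variants can be illustrated on normalized histograms of feature responses. This is a minimal sketch, assuming histogram-based density estimates with additive smoothing; the bin count, range, and the Gaussian stand-in samples are illustrative assumptions, not the thesis's configuration.

```python
import numpy as np

def feat_hist(x, bins=64, value_range=(-5.0, 6.0)):
    """Normalized histogram of feature responses, smoothed to avoid log(0)."""
    h, _ = np.histogram(x.ravel(), bins=bins, range=value_range, density=True)
    h = h + 1e-12
    return h / h.sum()

def sym_kl(p, q):
    """Symmetric Kullback-Leibler divergence: KL(p||q) + KL(q||p)."""
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

def js_div(p, q):
    """Jensen-Shannon divergence via the mixture distribution m."""
    m = 0.5 * (p + q)
    return float(0.5 * np.sum(p * np.log(p / m)) + 0.5 * np.sum(q * np.log(q / m)))

# Gaussian samples stand in for reference / distorted feature responses.
rng = np.random.default_rng(1)
pa = feat_hist(rng.normal(size=10000))
pb = feat_hist(rng.normal(loc=1.0, size=10000))
js_same = js_div(pa, pa)   # identical distributions -> 0
js_diff = js_div(pa, pb)   # shifted distribution -> positive divergence
kl_diff = sym_kl(pa, pb)
```

Unlike the plain KL divergence, both the JS divergence and the symmetrized KL treat reference and distorted distributions symmetrically, which is the property a full-reference quality measure needs.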
However, these deep network-based distribution measures are oversensitive to content misalignment. We therefore propose an FR-IQA model, deep order statistical similarity (DOSS), which is robust to content misalignment and can discern appealing content deformations. DOSS projects the reference and distorted images into a deep feature space and compares sorted feature statistics via cosine similarity to produce the final quality score. DOSS has two main advantages. First, it tolerates minor shifts and deformations and captures appealing content changes. Second, it has strong texture perception, delivering excellent assessment results on images generated by texture synthesis algorithms. Experiments show that DOSS achieves competitive performance on both synthetic distortion-based IQA datasets and those with distortions from modern enhancement algorithms, suggesting that it is suitable for evaluating aligned and misaligned images.
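The core mechanism of DOSS, comparing order statistics, can be sketched as follows. This is a minimal single-layer sketch under stated assumptions: a random array stands in for a deep feature map, and a circular shift stands in for content misalignment; sorting discards spatial position, so the score is unaffected by the shift.

```python
import numpy as np

def doss_similarity(ref_feat, dist_feat):
    """Compare order statistics of two feature maps: sort each map's
    responses and measure cosine similarity between the sorted vectors."""
    r = np.sort(ref_feat.ravel())
    d = np.sort(dist_feat.ravel())
    denom = np.linalg.norm(r) * np.linalg.norm(d) + 1e-12
    return float(np.dot(r, d) / denom)

# A random array stands in for a pretrained deep feature map.
rng = np.random.default_rng(0)
feat = rng.normal(size=(64, 32, 32))
shifted = np.roll(feat, shift=2, axis=-1)  # small spatial misalignment
aligned_score = doss_similarity(feat, feat)     # ~1.0
shifted_score = doss_similarity(feat, shifted)  # ~1.0: sorting ignores the shift
```

Because a spatial shift only permutes the feature values, the sorted vectors are identical and the similarity stays at its maximum, illustrating the claimed tolerance to minor misalignment.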
DOSS, however, does not unify perceptual image evaluation and enhancement. We propose a dual-branch IQA (DBIQA) framework that can both effectively guide perceptual enhancement and discern appealing content deformations. DBIQA achieves this by capturing the joint degradation effects of pretrained deep network features via two branches. The first branch uses kernel representation similarity analysis (KRSA) to compare self-similarity matrices via the mean absolute error (MAE), whereas the second branch uses direct pairwise feature comparisons. The final score is derived through a training-free logarithmic summation of both branches. Our framework offers three key advantages. First, integrating the KRSA with pairwise comparisons enhances the model's perceptual awareness. Second, DBIQA is adaptable to diverse network architectures. Third, DBIQA can guide perceptual image enhancement. Experiments on 10 datasets confirm the model's efficacy, showing that measuring joint degradation effects can identify appealing content deformations across diverse IQA scenarios.
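The two-branch combination described above can be sketched in a few lines. This is a minimal sketch under stated assumptions: the self-similarity matrix is computed as a normalized channel-wise Gram matrix (one plausible KRSA-style kernel), a random array stands in for pretrained features, and the `log(1 + x)` summation is an illustrative form of the training-free logarithmic combination.

```python
import numpy as np

def self_similarity(feat):
    """Channel-wise self-similarity (kernel) matrix of a C x N feature map,
    with rows L2-normalized so entries are cosine similarities."""
    f = feat.reshape(feat.shape[0], -1)
    f = f / (np.linalg.norm(f, axis=1, keepdims=True) + 1e-12)
    return f @ f.T

def dbiqa_score(ref_feat, dist_feat):
    # Branch 1: structural degradation via MAE between self-similarity
    # matrices (a KRSA-style comparison).
    krsa = np.mean(np.abs(self_similarity(ref_feat)
                          - self_similarity(dist_feat)))
    # Branch 2: direct pairwise feature comparison.
    pairwise = np.mean(np.abs(ref_feat - dist_feat))
    # Training-free logarithmic summation of the two branches;
    # lower means closer to the reference.
    return float(np.log(1.0 + krsa) + np.log(1.0 + pairwise))

# A random array stands in for a pretrained C x N feature map.
rng = np.random.default_rng(0)
f = rng.normal(size=(16, 64))
same_score = dbiqa_score(f, f)  # identical features -> 0.0
degraded_score = dbiqa_score(f, f + rng.normal(scale=0.5, size=f.shape))
```

The self-similarity branch responds to changes in the relationships between channels rather than their absolute values, which is what lets the combined score register structural deformations that a purely pairwise comparison would weight differently.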
| Date of Award | 16 Jun 2025 |
|---|---|
| Original language | English |
| Awarding Institution | |
| Supervisor | Hau San WONG (Supervisor) & Tak Wu Sam KWONG (External Co-Supervisor) |