No-reference Image and Video Quality Assessment


Student thesis: Doctoral Thesis


Author(s)

  • Yuming LI

Award date: 30 Jun 2016

Abstract

Image and video quality measurement is crucial for many applications, such as acquisition, compression, transmission, enhancement, and reproduction. No-reference (NR) image quality assessment (IQA) methods have attracted extensive attention because they do not rely on any information from the original images. However, most conventional NR-IQA methods are designed for only one or a small set of predefined distortion types and are unlikely to generalize to images or videos degraded by other kinds of distortion. In order to handle a wide range of image distortions, we first presented an efficient general-purpose NR-IQA framework, which can deal with many different distortion types and is easily extensible. Second, we extended the general idea of this NR-IQA framework and presented an efficient general-purpose no-reference video quality assessment (VQA) framework. Third, to demonstrate that the proposed NR-IQA framework can be applied directly to practical applications, we presented an image-quality-based face liveness detection algorithm built on this NR-IQA framework. Finally, we showed how state-of-the-art deep learning techniques can be incorporated into NR-IQA and discussed prospects for future NR-IQA research.
The general-purpose NR-IQA framework we proposed is based on the shearlet transform and deep neural networks. In this framework, simple features are extracted by a new multiscale and multidirectional transform (the shearlet transform), and the sum of subband coefficient amplitudes (SSCA) is used as the primary feature to describe the behavior of natural and distorted images. Stacked Auto-Encoders (SAE) are then applied as an 'evolution process' to 'amplify' the primary features and make them more discriminative. Finally, by translating the NR-IQA problem into a classification problem, the differences among the evolved features are identified by a softmax classifier. We have also incorporated visualization techniques to analyze and illustrate this NR-IQA framework. The resulting algorithm, which we name SESANIA (ShEarlet and Stacked Auto-encoders based No-reference Image quality Assessment), is tested on several databases (LIVE, Multiply Distorted LIVE, and TID2008) individually and combined. Experimental results demonstrate the excellent performance of SESANIA, and we also give intuitive explanations of how it works and why it works well. In addition, SESANIA is extended to estimate quality in local regions, and further experiments demonstrate its local quality estimation ability on images with local distortions.
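The shape of this pipeline can be sketched roughly as follows. This is only an illustrative sketch, not the thesis implementation: it assumes precomputed SSCA feature vectors, uses made-up layer sizes and quality-class counts, and omits the layer-wise auto-encoder pretraining, keeping only the encoder-plus-softmax fine-tuning stage.

    # Illustrative sketch of the SESANIA pipeline shape (not the thesis code):
    # precomputed SSCA shearlet features -> stacked auto-encoder layers -> softmax.
    # Feature dimension, layer sizes, and number of classes are assumptions.
    import torch
    import torch.nn as nn

    class SesaniaLikeClassifier(nn.Module):
        def __init__(self, in_dim=72, hidden=(100, 100), n_classes=5):
            super().__init__()
            layers, d = [], in_dim
            for h in hidden:                     # encoder half of each auto-encoder
                layers += [nn.Linear(d, h), nn.Sigmoid()]
                d = h
            self.encoder = nn.Sequential(*layers)
            self.head = nn.Linear(d, n_classes)  # softmax applied via cross-entropy

        def forward(self, x):
            return self.head(self.encoder(x))

    model = SesaniaLikeClassifier()
    ssca = torch.randn(8, 72)                    # stand-in for SSCA feature vectors
    labels = torch.randint(0, 5, (8,))           # quality-level class labels
    loss = nn.CrossEntropyLoss()(model(ssca), labels)
    loss.backward()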
The general-purpose NR-VQA framework we proposed is based on the 3D shearlet transform and Convolutional Neural Networks (CNN). Taking video blocks as input, simple and efficient primary spatiotemporal features are extracted by the 3D shearlet transform; these features capture Natural Scene Statistics (NSS) properties. A CNN and logistic regression are then concatenated to exaggerate the discriminative parts of the primary features and predict a perceptual quality score. The resulting algorithm, which we name SACONVA (SheArlet and COnvolutional neural network based No-reference Video quality Assessment), is tested on the well-known LIVE, IVPL, and CSIQ VQA databases. The results demonstrate that SACONVA predicts video quality well and is competitive with current state-of-the-art full-reference VQA methods and general-purpose NR-VQA algorithms. SACONVA is also extended to classify the different video distortion types in these three databases and achieves excellent classification accuracy. In addition, we demonstrate that SACONVA can be applied directly in real applications such as blind video denoising.
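A similarly hedged sketch of the SACONVA pipeline shape is given below. The feature length, channel counts, and kernel sizes are illustrative assumptions, and the 3D shearlet feature extraction itself is represented only by a placeholder tensor.

    # Illustrative sketch of the SACONVA pipeline shape (not the thesis code):
    # primary spatiotemporal features of video blocks (from a 3D shearlet
    # transform) -> small 1D CNN -> regression head giving one quality score.
    import torch
    import torch.nn as nn

    class SaconvaLikeRegressor(nn.Module):
        def __init__(self, feat_len=120):
            super().__init__()
            self.cnn = nn.Sequential(
                nn.Conv1d(1, 8, kernel_size=5, padding=2), nn.ReLU(),
                nn.MaxPool1d(2),
                nn.Conv1d(8, 16, kernel_size=5, padding=2), nn.ReLU(),
                nn.MaxPool1d(2),
            )
            self.regressor = nn.Linear(16 * (feat_len // 4), 1)  # scalar score

        def forward(self, x):              # x: (batch, feat_len) shearlet features
            h = self.cnn(x.unsqueeze(1))
            return self.regressor(h.flatten(1)).squeeze(1)

    model = SaconvaLikeRegressor()
    features = torch.randn(4, 120)         # stand-in for 3D-shearlet block features
    scores = model(features)               # predicted perceptual quality scores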
Face recognition is a widely used biometric technology because of its convenience, but it is vulnerable to spoofing attacks made with non-real faces, such as a photograph or video of a valid user. The anti-spoofing problem must be resolved before face recognition can be widely applied in daily life, and face liveness detection is a core technology for ensuring that the input face belongs to a live person. This remains very challenging for conventional liveness detection approaches based on texture analysis and motion detection. To demonstrate that the proposed NR-IQA framework can be applied directly to practical applications, we presented an image-quality-based face liveness detection algorithm using our NR-IQA framework and evaluated it on the CASIA Face Anti-Spoofing Database and the Replay-Attack Database. Following the evaluation protocols provided with these databases, our approach performs better than state-of-the-art techniques and can significantly enhance the security of face recognition biometric systems. Experimental results also demonstrate that the framework can easily be extended to classify different types of spoofing attack.
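As a rough illustration of how quality features can drive liveness detection as a binary classification problem, the fragment below (with assumed feature and layer sizes, not the thesis code) maps NR-IQA features of a face image to live/spoof logits.

    # Illustrative sketch: quality-feature-based liveness detection as binary
    # classification. Dimensions and labels are assumptions, not thesis values.
    import torch
    import torch.nn as nn

    liveness_head = nn.Sequential(
        nn.Linear(72, 100), nn.Sigmoid(),  # SAE-style feature 'evolution' layer
        nn.Linear(100, 2),                 # two classes: live vs. spoof
    )
    quality_features = torch.randn(1, 72)  # NR-IQA features of one face image
    is_live = liveness_head(quality_features).argmax(dim=1)  # 1 = live, 0 = spoof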
Deep neural networks have been applied to many tasks (such as classification, denoising, and inpainting) with impressive performance. At the end of this thesis, we proposed to use deep CNNs together with state-of-the-art training techniques to address the NR-IQA problem, and discussed prospects for future NR-IQA research.
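For concreteness, a minimal patch-based CNN quality regressor of the kind discussed here might look like the sketch below; the architecture, patch size, and patch-score pooling are assumptions rather than the model proposed in the thesis.

    # Illustrative sketch of a patch-based deep-CNN NR-IQA regressor
    # (an assumed architecture, not the thesis model).
    import torch
    import torch.nn as nn

    class PatchQualityCNN(nn.Module):
        def __init__(self):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),
            )
            self.score = nn.Linear(64, 1)  # one quality score per patch

        def forward(self, patches):        # patches: (batch, 1, 32, 32) grayscale
            return self.score(self.features(patches).flatten(1)).squeeze(1)

    patch_scores = PatchQualityCNN()(torch.randn(16, 1, 32, 32))
    image_quality = patch_scores.mean()    # pool patch scores into an image score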