Data-driven Discriminative Feature Learning for Biometric Vein Recognition and Face Recognition

Student thesis: Doctoral Thesis

Award date: 24 Aug 2021

Abstract

Biometric recognition technology exploits the unique and stable biological traits of the human body to recognize a person, thereby eliminating the need to carry physical keys or certificates or to memorize passwords. It has found many important applications in our daily lives, such as access control for personal devices, check-in at train stations and hotels, and smart homes. Vein recognition is a promising biometric technique that utilizes the pattern of veins beneath the skin surface to achieve identity recognition. Since the veins are covered by the skin, they are difficult to steal or forge, which inherently increases the security of a biometric system. Face recognition is a popular biometric technique that has recently achieved impressive performance, driven by the availability of large-scale datasets and deep learning techniques. Despite its weaknesses in privacy protection and security, it still plays an important role in many applications such as surveillance and criminal identification. In practical applications, biometric images may exhibit large intra-class variations and high inter-class similarity, which poses a great challenge for achieving robust identity recognition in a biometric system. Therefore, this thesis is dedicated to investigating effective feature learning techniques to improve the performance of biometric vein recognition and face recognition.


Finger vein recognition (FVR) based on deep learning (DL) has gained rising attention in recent years. However, the performance of FVR is limited by the insufficient amount of finger vein training data and the weak generalization of learned features. To address this issue, we propose a simple finger vein verification framework that systematically optimizes the training pipeline. We design a simple flipping-based inter-class data augmentation technique that doubles the number of finger vein training classes by creating new vein patterns. This technique can be combined with traditional intra-class data augmentation methods to achieve more effective data augmentation. To enhance the discrimination of features, we design a fusion loss that incorporates a classification loss and a metric learning loss. We find that fusing these two penalty signals leads to a good trade-off between intra-class similarity and inter-class separability, thereby improving the generalization ability of the learned features. To examine the reliability and efficiency of the proposed framework, we develop a deep learning-based finger vein verification system (DeepFV) that performs end-to-end biometric verification under simulated real-world conditions. Under a challenging open-set evaluation protocol, extensive experiments conducted on three public finger vein databases and an in-house database confirm the effectiveness of the proposed method.
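
To make this concrete, the following is a minimal sketch, assuming a PyTorch setup, of how the flipping-based inter-class augmentation and a classification-plus-metric-learning fusion loss could look. The function names, the choice of a triplet loss as the metric term, and the loss weight are illustrative assumptions rather than the exact formulation used in the thesis.

import torch
import torch.nn as nn

def flip_augment(images, labels, num_classes):
    # Horizontally flipped vein images exhibit new vein patterns, so they are
    # treated as brand-new classes, doubling the number of training classes.
    flipped = torch.flip(images, dims=[-1])
    new_labels = labels + num_classes
    return torch.cat([images, flipped], dim=0), torch.cat([labels, new_labels], dim=0)

class FusionLoss(nn.Module):
    # Assumed form of the fusion loss: a weighted sum of a softmax
    # classification loss and a triplet metric-learning loss.
    def __init__(self, weight=1.0, margin=0.3):
        super().__init__()
        self.ce = nn.CrossEntropyLoss()
        self.triplet = nn.TripletMarginLoss(margin=margin)
        self.weight = weight

    def forward(self, logits, labels, anchor, positive, negative):
        return self.ce(logits, labels) + self.weight * self.triplet(anchor, positive, negative)

In this sketch the flipped copies participate in training exactly like real classes, so conventional intra-class augmentations (rotation, translation, and so on) can still be applied on top of them.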


We conduct a deeper study to tackle the data shortage challenge for deep learning-based biometric vein verification. Traditional data augmentation approaches can only augment intra-class samples, and the flipping-based inter-class data augmentation technique can only generate a fixed set of vein classes derived from the original classes. All of these approaches are restricted either to the intra-class space or to a fixed inter-class space, which compromises performance. To achieve more effective data augmentation, we propose a GAN-based approach to generate arbitrary-pattern vein images, which can augment the training data with arbitrary vein classes. The proposed approach consists of three progressive synthesis steps: generating random vein patterns in the binary space, refining the binary vein patterns, and rendering them into grayscale vein images. To learn effective feature representations, we adopt unsupervised contrastive learning to learn augmentation-invariant and instance-separating representations from the synthetic samples, and then perform supervised fine-tuning on the real training data. We evaluate the proposed methods on well-known public finger vein and palm vein databases using verification tasks. Experimental results show that we can generate high-fidelity and diverse vein image samples, thereby effectively alleviating the data shortage problem and improving the learning of feature representations.
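
As one concrete illustration of the unsupervised contrastive pretraining step, below is a minimal sketch of an NT-Xent-style loss computed on two augmented views of the synthetic vein samples; the temperature value and this particular instance-discrimination objective are assumptions and may differ from the exact objective used in the thesis.

import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    # z1, z2: embeddings of two augmented views of the same synthetic images,
    # shape (N, D); L2-normalise so dot products become cosine similarities.
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                      # (2N, D)
    sim = torch.mm(z, z.t()) / temperature              # pairwise similarities
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float('-inf'))          # exclude self-pairs
    # The positive for sample i in the first view is sample i in the second view.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)

Each synthetic image is treated as its own instance, so the loss pulls its two augmented views together while pushing apart all other samples in the batch, yielding augmentation-invariant and instance-separating features before the supervised fine-tuning on real data.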


The vein image generation approach proposed above requires multiple synthesis steps and relies on binary vein pattern information, which is inconvenient for practical use. To overcome this limitation, we propose an unconditional generative self-supervised contrastive learning (GSCL) framework to learn better features for biometric vein verification. The proposed framework is simple and effective: it first learns to generate unlabelled vein image samples using a powerful style-based generative model. We find that the StyleGAN2 generator can generate distinct vein samples, which allows us to perform self-supervised contrastive learning (SCL) on the synthetic samples to learn discriminative features in the pretraining stage. By investigating two representative SCL frameworks, SimCLR and BYOL, we find that negative training samples are crucial for improving feature quality when training on discriminative synthetic samples, so SimCLR is adopted for SCL pretraining. Afterwards, the embedding model is fine-tuned on the original training data in a supervised manner to further improve representation quality. In addition, a new clustering-based enrollment method is investigated to improve performance in practical applications. Extensive experiments conducted on two well-known public finger vein and palm vein databases, as well as a newly collected in-house finger vein video database, demonstrate the efficacy of GSCL and its superior performance. Through the improved feature extraction and enrollment methods, our DeepFV-2 system achieves significantly improved verification performance under practical working conditions.
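
For the clustering-based enrollment, one plausible reading (a sketch under assumed details, not the exact procedure of DeepFV-2) is to cluster a user's enrollment embeddings, keep the cluster centroids as compact templates, and verify a probe by its best cosine similarity to those templates; the number of templates and the decision threshold below are illustrative placeholders.

import numpy as np
from sklearn.cluster import KMeans

def enroll(embeddings, n_templates=3):
    # Cluster the user's enrollment embeddings (e.g. extracted from video
    # frames) and keep the unit-normalised centroids as templates.
    kmeans = KMeans(n_clusters=n_templates, n_init=10).fit(embeddings)
    centers = kmeans.cluster_centers_
    return centers / np.linalg.norm(centers, axis=1, keepdims=True)

def verify(probe, templates, threshold=0.6):
    # Accept the identity claim if the best cosine similarity between the
    # probe embedding and any stored template exceeds the decision threshold.
    probe = probe / np.linalg.norm(probe)
    return float(np.max(templates @ probe)) >= threshold

Clustering compresses many enrollment frames into a few representative templates, which keeps matching fast while still covering the intra-class variation seen during enrollment.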


Although face recognition does not suffer from the data shortage problem, it remains challenging due to harsh conditions (such as occlusion, viewpoint, illumination, and low resolution) in real-world applications. In this thesis, we focus on the design of loss functions that enhance the intra-class similarity and inter-class separability of facial features. We propose a Linear-Cosine Softmax Loss (LinCos-Softmax) to learn angle-discriminative facial features more effectively. The main characteristic of this loss function is the use of an approximately linear logit obtained through a Taylor expansion. Compared with the conventional cosine logit, it has a stronger linear relationship with the angle, which enhances angular discrimination. We also design an automatic scale parameter selection scheme, which conveniently provides an appropriate scale for different logits without an exhaustive parameter search. In addition, we develop a margin-enhanced Linear-Cosine Softmax Loss (m-LinCos-Softmax) to further enlarge inter-class distances and reduce intra-class variations. Experimental results on typical face recognition benchmarks demonstrate the effectiveness of the proposed method and its superiority over existing angular softmax loss variants.
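
As a rough illustration of the angle-linear idea, the sketch below implements an angular softmax whose logit is the first-order Taylor expansion of cos(theta) around pi/2, i.e. (pi/2 - theta), which is exactly linear in the angle, together with an additive angular margin on the target class for the margin-enhanced variant. The expansion point, the fixed scale value, and the margin form are assumptions for illustration and may differ from the thesis' exact LinCos-Softmax and m-LinCos-Softmax formulations; the automatic scale selection scheme is not shown.

import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class LinCosSoftmaxSketch(nn.Module):
    def __init__(self, feat_dim, num_classes, scale=30.0, margin=0.35):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(num_classes, feat_dim))
        nn.init.xavier_uniform_(self.weight)
        self.scale, self.margin = scale, margin

    def forward(self, features, labels):
        # Cosine of the angle between each feature and each class weight vector.
        cos = F.linear(F.normalize(features), F.normalize(self.weight))
        theta = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))
        # Margin-enhanced variant: enlarge the target-class angle before scoring.
        one_hot = F.one_hot(labels, num_classes=cos.size(1)).to(theta.dtype)
        theta = theta + one_hot * self.margin
        # Logit that is linear in the angle (first-order expansion of cos at pi/2).
        logits = self.scale * (math.pi / 2 - theta)
        return F.cross_entropy(logits, labels)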