Abstract
Face Sketch Recognition (FSR) is extremely challenging because of the heterogeneous gap between sketches and images. Relying on the ability of generative models, generation-based methods have long dominated FSR by decomposing it into two steps: heterogeneous data synthesis and homogeneous data matching. However, this decomposition introduces noise and uncertainty, and the first step, heterogeneous data synthesis, is an even more general and challenging problem than FSR itself. Requiring a more general problem to be solved in order to solve a specific one puts the cart before the horse. To solve FSR directly and circumvent these problems of generation-based methods, we propose a multi-view representation learning (MRL) framework based on Multivariate Loss and Hierarchical Loss (MvHi). Specifically, by using triplet loss as a bridge to connect the augmented representations generated by InfoNCE, we propose Multivariate Loss (Mv) to construct a more robust common feature subspace between sketches and images and solve FSR directly in this subspace. Moreover, Hierarchical Loss (Hi) is proposed to improve training stability by utilizing the hidden states of the feature extractor. Comprehensive experiments on two commonly used datasets, CUFS and CUFSF, show that the proposed approach outperforms state-of-the-art methods by more than 7%. In addition, visualization experiments show that the proposed approach extracts common representations across multi-view data more effectively than the baseline methods. © 2024 IEEE.
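As a rough illustration of the two standard ingredients the abstract names, the sketch below computes a textbook InfoNCE loss and a textbook triplet margin loss on toy sketch and photo embeddings, then adds them together. It is not the paper's Multivariate Loss: the abstract does not give that formula, so the function `combined_loss`, the `weight`, `temperature`, and `margin` parameters, the cosine similarity measure, and the choice of the next batch item as the negative are all assumptions made purely for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss: anchors[i] should match positives[i];
    every other entry of positives acts as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over all candidates.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)

def triplet(anchor, positive, negative, margin=0.2):
    """Triplet margin loss using cosine distance (1 - cosine similarity)."""
    d_pos = 1.0 - cosine(anchor, positive)
    d_neg = 1.0 - cosine(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

def combined_loss(sketches, photos, weight=1.0):
    """Hypothetical combination (an assumption, NOT the paper's Mv loss):
    cross-modal InfoNCE plus a weighted mean triplet term."""
    n = len(sketches)
    contrastive = info_nce(sketches, photos)
    # Assumed negative for item i: the photo of the next identity in the batch.
    triplet_term = sum(
        triplet(sketches[i], photos[i], photos[(i + 1) % n]) for i in range(n)
    ) / n
    return contrastive + weight * triplet_term

# Toy 2-D embeddings: two identities, one sketch and one photo each.
sketches = [[1.0, 0.0], [0.0, 1.0]]
aligned_photos = [[0.9, 0.1], [0.1, 0.9]]   # each photo lies near its sketch
swapped_photos = [[0.1, 0.9], [0.9, 0.1]]   # each photo lies near the wrong sketch

print(combined_loss(sketches, aligned_photos))
print(combined_loss(sketches, swapped_photos))
```

When sketch and photo embeddings of the same identity lie close together in the shared space, both terms are small; when the pairing is swapped, both terms grow, which is the behaviour a common feature subspace is meant to reward.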
| Original language | English |
|---|---|
| Pages (from-to) | 2037-2049 |
| Journal | IEEE Transactions on Emerging Topics in Computational Intelligence |
| Volume | 8 |
| Issue number | 2 |
| Online published | 12 Feb 2024 |
| DOIs | |
| Publication status | Published - Apr 2024 |
Funding
This work was supported in part by the InnoHK initiative, The Government of the HKSAR, and in part by the Laboratory for AI-Powered Financial Technologies.
Research Keywords
- Face sketch recognition
- hierarchical loss
- multi-view representation learning
- multivariate loss