Visual Structure Constraint for Transductive Zero-Shot Learning in the Wild
Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review
Author(s)
Wan, Ziyu; Chen, Dongdong; Liao, Jing
Related Research Unit(s)
Detail(s)
| Original language | English |
| --- | --- |
| Pages (from-to) | 1893–1909 |
| Number of pages | 17 |
| Journal / Publication | International Journal of Computer Vision |
| Volume | 129 |
| Issue number | 6 |
| Online published | 19 Apr 2021 |
| Publication status | Published - Jun 2021 |
Link(s)
Abstract
To recognize objects of unseen classes, most existing Zero-Shot Learning (ZSL) methods first learn a compatible projection function between the common semantic space and the visual space from the data of the source seen classes, and then directly apply it to the target unseen classes. However, for data in the wild, the distributions of the source and target domains might not match well, causing the well-known domain shift problem. Based on the observation that the visual features of test instances can be separated into different clusters, we propose a new visual structure constraint on class centers for transductive ZSL to improve the generality of the projection function (i.e., to alleviate the above domain shift problem). Specifically, three different strategies (symmetric Chamfer distance, bipartite matching distance, and Wasserstein distance) are adopted to align the projected unseen semantic centers with the visual cluster centers of the test instances. We also propose two new training strategies to handle data in the wild, where many unrelated images may exist in the test dataset. This realistic setting has never been considered in previous methods. Extensive experiments demonstrate that the proposed visual structure constraint consistently brings substantial performance gains and that the new training strategies make the method generalize well to data in the wild. The source code is available at https://github.com/raywzy/VSC.
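The alignment idea described above can be illustrated with a small sketch. The following NumPy/scikit-learn snippet is not the authors' released implementation (see the linked repository for that); the function and variable names are illustrative. It shows the symmetric Chamfer-distance variant: test visual features are clustered, and the constraint is the sum of nearest-neighbor distances from the projected unseen semantic centers to the visual cluster centers and back.

```python
# Minimal sketch of the symmetric Chamfer-distance alignment between
# projected unseen-class semantic centers and visual cluster centers.
# Names are illustrative, not taken from the authors' code.
import numpy as np
from sklearn.cluster import KMeans


def chamfer_alignment_loss(projected_centers, visual_features, n_unseen):
    """Symmetric Chamfer distance between projected semantic centers
    (shape: n_unseen x d) and the cluster centers of unlabeled test
    visual features (shape: n_test x d)."""
    # Cluster the test-time visual features to obtain visual cluster centers.
    cluster_centers = (
        KMeans(n_clusters=n_unseen, n_init=10)
        .fit(visual_features)
        .cluster_centers_
    )

    # Pairwise squared Euclidean distances between the two sets of centers.
    diff = projected_centers[:, None, :] - cluster_centers[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)  # shape: (n_unseen, n_unseen)

    # Symmetric Chamfer distance: nearest-neighbor terms in both directions.
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()
```

In the transductive setting this quantity would be minimized jointly with the usual projection loss on the seen classes, pulling each projected unseen center toward a nearby visual cluster (and vice versa) to counteract domain shift; the bipartite-matching and Wasserstein variants replace the nearest-neighbor matching with one-to-one and optimal-transport matchings, respectively.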
Research Area(s)
- Computer vision, Visual structure constraint, Zero-shot learning
Bibliographic Note
Research Unit(s) information for this publication is provided by the author(s) concerned.
Citation Format(s)
Visual Structure Constraint for Transductive Zero-Shot Learning in the Wild. / Wan, Ziyu; Chen, Dongdong; Liao, Jing.
In: International Journal of Computer Vision, Vol. 129, No. 6, 06.2021, p. 1893–1909.
Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review