VITAL: VIsual Tracking via Adversarial Learning

Research output: Chapters, Conference Papers, Creative and Literary Works (RGC: 12, 32, 41, 45); 32_Refereed conference paper (with ISBN/ISSN); peer-reviewed

262 Scopus Citations

Author(s)

  • Yibing Song
  • Chao Ma
  • Xiaohe Wu
  • Lijun Gong
  • Linchao Bao
  • Wangmeng Zuo
  • Chunhua Shen
  • Rynson W.H. Lau
  • Ming-Hsuan Yang

Related Research Unit(s)

Detail(s)

Original language: English
Title of host publication: Proceedings: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
Subtitle of host publication: CVPR 2018
Publisher: IEEE
Pages: 8990-8999
ISBN (Electronic): 9781538664209
ISBN (Print): 9781538664216
Publication status: Published - Jun 2018

Publication series

Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
ISSN (Print): 1063-6919
ISSN (Electronic): 2575-7075

Conference

Title: The Thirtieth IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2018)
Place: United States
City: Salt Lake City
Period: 18 - 22 June 2018

Abstract

The tracking-by-detection framework consists of two stages: drawing samples around the target object in the first stage, and classifying each sample as the target object or as background in the second stage. The performance of existing trackers using deep classification networks is limited in two respects. First, the positive samples in each frame are highly overlapped in space, so they fail to capture rich appearance variations. Second, there is extreme class imbalance between positive and negative samples. This paper presents the VITAL algorithm to address these two problems via adversarial learning. To augment positive samples, we use a generative network to randomly generate masks, which are applied to adaptively drop out input features and thereby capture a variety of appearance changes. Through adversarial learning, our network identifies the mask that maintains the most robust features of the target object over a long temporal span. In addition, to handle the class imbalance, we propose a high-order cost-sensitive loss that decreases the effect of easy negative samples, which facilitates training the classification network. Extensive experiments on benchmark datasets demonstrate that the proposed tracker performs favorably against state-of-the-art approaches.
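The two ideas in the abstract can be sketched in a few lines of NumPy. This is an illustrative toy, not the authors' implementation: the adversarial step is reduced to picking, among random candidate masks, the one that most degrades a stand-in confidence score, and the high-order cost-sensitive loss is approximated by a focal-style modulation that down-weights easy negatives. All function names, the `keep_prob` parameter, and the `gamma` exponent are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def adversarial_dropout(features, n_masks=8, keep_prob=0.7):
    """Sample candidate binary dropout masks and keep the one whose
    application most lowers a placeholder confidence score -- i.e. the
    adversarial (hardest) mask for the classifier."""
    best_mask, worst_score = None, np.inf
    for _ in range(n_masks):
        mask = (rng.random(features.shape) < keep_prob).astype(features.dtype)
        score = float(np.sum(features * mask))  # stand-in for classifier confidence
        if score < worst_score:
            worst_score, best_mask = score, mask
    return best_mask

def cost_sensitive_loss(p, y, gamma=2.0):
    """Focal-style modulation: easy samples (prediction already confident
    and correct) get their loss shrunk by (1 - p_t)**gamma, so abundant
    easy negatives no longer dominate training."""
    p = np.clip(p, 1e-7, 1 - 1e-7)
    pt = np.where(y == 1, p, 1 - p)        # probability of the true class
    return float(np.mean(-((1 - pt) ** gamma) * np.log(pt)))

features = rng.random((3, 3))
mask = adversarial_dropout(features)
print(mask.shape)  # (3, 3)

# An easy negative (p near 0, label 0) contributes far less than a hard one:
easy = cost_sensitive_loss(np.array([0.01]), np.array([0]))
hard = cost_sensitive_loss(np.array([0.60]), np.array([0]))
print(easy < hard)  # True
```

The design point the toy captures: the generator and classifier pull in opposite directions, so the classifier is forced to rely on features that survive the worst-case dropout, while the modulated loss keeps the gradient signal focused on hard samples.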

Bibliographic Note

Research Unit(s) information for this publication is provided by the author(s) concerned.

Citation Format(s)

VITAL: VIsual Tracking via Adversarial Learning. / Song, Yibing; Ma, Chao; Wu, Xiaohe; Gong, Lijun; Bao, Linchao; Zuo, Wangmeng; Shen, Chunhua; Lau, Rynson W.H.; Yang, Ming-Hsuan.

Proceedings: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition: CVPR 2018. IEEE, 2018. pp. 8990-8999. Article 8579035. (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition).
