TY - JOUR
T1 - Multi-Stage Visual Tracking with Siamese Anchor-Free Proposal Network
AU - Han, Guang
AU - Su, Jinpeng
AU - Liu, Yaoming
AU - Zhao, Yuqiu
AU - Kwong, Sam
PY - 2023
Y1 - 2023
AB - The core challenge of visual object tracking is to locate the target under various kinds of noise interference and to obtain accurate bounding-box coordinates for it. Recently, object tracking based on Siamese networks has made great breakthroughs, and a growing number of Siamese network trackers with superior performance have been proposed. However, they still have some shortcomings. To this end, this paper proposes a new Multi-Stage visual tracking algorithm with a Siamese Anchor-Free Proposal Network (MS-SiamAFPN). The algorithm is a three-stage Siamese network tracker composed of a Feature Extraction and Fusion (FEF) sub-network, a Classification and Regression (CR) sub-network, and a Validation and Regression (VR) sub-network connected in series. First, the Anchor-Free Proposal Network (AFPN) module is designed for the CR stage; it makes full use of positive and negative samples for training while reducing the number of neural network parameters. Second, to achieve better robustness and recognizability in the VR stage, two components are introduced. On the one hand, a novel Feature Purification (FP) module automatically selects the important channels and extracts the features of irregular regions from the input fusion features, strengthening the representation ability of the image features. On the other hand, target recognition and position regression are treated as separate tasks, and the recognition score and position fine-tuning of candidate targets are obtained with a newly designed Dual-Branch Network (DBN) structure, thereby avoiding feature ambiguity. Owing to the synergy of these innovations, MS-SiamAFPN obtains a large performance improvement and achieves state-of-the-art (SOTA) performance on multiple public benchmark datasets. © 2021 IEEE.
KW - anchor-free
KW - Deep learning
KW - Feature extraction
KW - feature purification
KW - Interference
KW - Object tracking
KW - Siamese network
KW - Target tracking
KW - Task analysis
KW - Training
UR - http://www.scopus.com/inward/record.url?scp=85119443955&partnerID=8YFLogxK
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-85119443955&origin=recordpage
U2 - 10.1109/TMM.2021.3127357
DO - 10.1109/TMM.2021.3127357
M3 - RGC 21 - Publication in refereed journal
SN - 1520-9210
VL - 25
SP - 430
EP - 442
JO - IEEE Transactions on Multimedia
JF - IEEE Transactions on Multimedia
ER -