Learning Semantic Alignment using Global Features and Multi-scale Confidence

Huaiyuan Xu, Jing Liao, Huaping Liu, Yuxiang Sun*

*Corresponding author for this work

Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review

2 Citations (Scopus)

Abstract

Semantic alignment aims to establish pixel correspondences between images based on semantic consistency. It can serve as a fundamental component for various downstream computer vision tasks, such as style transfer and exemplar-based colorization. Many existing methods use local features and their cosine similarities to infer semantic alignment. However, they struggle with significant intra-class variation in object appearance, size, and other attributes. In other words, contents with the same semantics can look very different. To address this issue, we propose a novel deep neural network whose core lies in global feature enhancement and adaptive multi-scale inference. Specifically, two modules are proposed: an enhancement transformer that enhances semantic features with global awareness, and a probabilistic correlation module that adaptively fuses multi-scale information based on learned confidence scores. We use a unified network architecture to achieve two types of semantic alignment, namely cross-object semantic alignment and cross-domain semantic alignment. Experimental results demonstrate that our method achieves competitive performance on five standard cross-object semantic alignment benchmarks and outperforms the state of the art in cross-domain semantic alignment.
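As a rough illustration of two ideas the abstract mentions (cosine-similarity correlation between local features, and confidence-weighted fusion across scales), the following minimal sketch may help. It is not the authors' implementation: the function names, the toy feature vectors, the use of a single scalar confidence per scale, and the assumption that all scales are already resampled to a common grid are hypothetical simplifications chosen for this example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def correlation(src, tgt):
    """Correlation matrix: rows index source positions, columns index target positions."""
    return [[cosine(s, t) for t in tgt] for s in src]

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(correlations, confidence_logits):
    """Confidence-weighted sum of per-scale correlation matrices of equal shape.

    Assumption: one scalar confidence per scale; a learned model could
    instead predict confidences per position.
    """
    weights = softmax(confidence_logits)
    rows, cols = len(correlations[0]), len(correlations[0][0])
    return [[sum(w * c[i][j] for w, c in zip(weights, correlations))
             for j in range(cols)] for i in range(rows)]

def best_matches(corr):
    """For each source position, the index of the highest-scoring target position."""
    return [max(range(len(row)), key=row.__getitem__) for row in corr]

# Toy features: two positions per image, two scales (hypothetical values).
coarse_src, coarse_tgt = [[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]
fine_src, fine_tgt = [[1.0, 1.0], [1.0, 0.0]], [[1.0, 0.9], [0.9, 1.0]]

# The coarse scale is given the higher (hypothetical) confidence logit.
fused = fuse([correlation(coarse_src, coarse_tgt),
              correlation(fine_src, fine_tgt)], [2.0, 0.0])
matches = best_matches(fused)
```

In this toy setup the fine scale alone is ambiguous for the first source position, so the fused result is driven mostly by the coarse scale, which has the larger confidence weight.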

© 2023 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
Original language: English
Pages (from-to): 897-910
Number of pages: 14
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Volume: 34
Issue number: 2
Online published: 21 Jun 2023
Publication status: Published - Feb 2024

Bibliographical note

Research Unit(s) information for this publication is provided by the author(s) concerned.
