TokenHPE: Learning Orientation Tokens for Efficient Head Pose Estimation via Transformers
Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 32 - Refereed conference paper (with host publication) › peer-review
Detail(s)
Original language | English |
---|---|
Title of host publication | Proceedings - 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition CVPR 2023 |
Publisher | Institute of Electrical and Electronics Engineers, Inc. |
Pages | 8897-8906 |
ISBN (electronic) | 979-8-3503-0129-8 |
ISBN (print) | 979-8-3503-0130-4 |
Publication status | Published - 2023 |
Publication series
Name | Proceedings - IEEE Computer Society Conference on Computer Vision and Pattern Recognition |
---|---|
ISSN (Print) | 1063-6919 |
ISSN (electronic) | 2575-7075 |
Conference
Title | 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2023) |
---|---|
Location | Vancouver Convention Center |
Place | Canada |
City | Vancouver |
Period | 18 - 22 June 2023 |
Link(s)
DOI | DOI |
---|---|
Permanent Link | https://scholars.cityu.edu.hk/en/publications/publication(39da1a18-dce2-4a96-abe1-01a71163c957).html |
Abstract
Head pose estimation (HPE) has been widely used in the fields of human-machine interaction, self-driving, and attention estimation. However, existing methods cannot deal with extreme head pose randomness and serious occlusions. To address these challenges, we identify three cues from head images, namely, neighborhood similarities, significant facial changes, and critical minority relationships. To leverage these cues, we propose a novel critical minority relationship-aware method based on the Transformer architecture, in which the relationships between facial parts can be learned. Specifically, we design several orientation tokens to explicitly encode the basic orientation regions. Meanwhile, a novel token guide multi-loss function is designed to guide the orientation tokens as they learn the desired regional similarities and relationships. We evaluate the proposed method on three challenging benchmark HPE datasets. Experiments show that our method achieves better performance than state-of-the-art methods. Our code is publicly available at https://github.com/zc2023/TokenHPE. ©2023 IEEE.
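As a rough illustration of what "basic orientation regions" could mean in practice, the sketch below maps a (yaw, pitch) pair to one of nine coarse regions. It is not taken from the paper: the bin edges, the 3 × 3 grid, and the names `YAW_EDGES`, `PITCH_EDGES`, `orientation_region`, and `one_hot` are assumptions made for this example only. The repository linked in the abstract is the reference for the actual partition and loss.

```python
from bisect import bisect_right

# Hypothetical bin edges in degrees (an assumption for illustration;
# the region partition used by TokenHPE may differ).
YAW_EDGES = (-30.0, 30.0)    # splits yaw into left / frontal / right
PITCH_EDGES = (-30.0, 30.0)  # splits pitch into down / level / up


def orientation_region(yaw_deg, pitch_deg):
    """Return a region index in [0, 8] for a (yaw, pitch) pair in degrees."""
    yaw_bin = bisect_right(YAW_EDGES, yaw_deg)        # 0, 1, or 2
    pitch_bin = bisect_right(PITCH_EDGES, pitch_deg)  # 0, 1, or 2
    # Row-major index over the 3 x 3 grid of (pitch, yaw) bins.
    return pitch_bin * (len(YAW_EDGES) + 1) + yaw_bin


def one_hot(index, size=9):
    """Encode a region index as a one-hot list, e.g. as a per-region target."""
    return [1 if i == index else 0 for i in range(size)]


if __name__ == "__main__":
    region = orientation_region(0.0, 0.0)  # frontal, level head
    print(region, one_hot(region))
```

A region label of this kind is one plausible supervision signal for per-region tokens; whether the paper uses hard labels like these is not stated in the abstract.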
Citation Format(s)
TokenHPE: Learning Orientation Tokens for Efficient Head Pose Estimation via Transformers. / Zhang, Cheng; Liu, Hai; Deng, Yongjian et al.
Proceedings - 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition CVPR 2023. Institute of Electrical and Electronics Engineers, Inc., 2023. p. 8897-8906 (Proceedings - IEEE Computer Society Conference on Computer Vision and Pattern Recognition).