TY - JOUR
T1 - MHPE
T2 - Learning Morphology Relationships for Robust Head Pose Estimation with Facial Rotation Representation
AU - Liu, Tingting
AU - Ju, Jianping
AU - Song, Zhixiong
AU - Qian, Shijia
AU - Rao, Ning
AU - Liu, Hai
AU - Li, You-Fu
PY - 2026/1/5
Y1 - 2026/1/5
N2 - Although accurate head pose estimation is critical for natural human-computer interaction, it remains challenging due to occlusion, extreme poses, illumination conditions, and data ambiguity issues. To address these challenges, a novel morphology-aware Transformer framework (MHPE) is proposed, which can learn morphological relationships during facial rotation. The methodology is based on two key findings: cross-region geometric dependencies and angle-specific morphodynamic representations. The proposed framework incorporates two key components: adversarial feature generation, which generates robust rotation representations by adaptive multi-scale feature interaction; and morphology relationship inference, which establishes long-range dependencies between facial features through a cross-modal attention mechanism that incorporates morphological priors. Extensive evaluations on three demanding benchmarks (BIWI, AFLW2000, and 300W-LP) demonstrate state-of-the-art performance, particularly in demanding scenarios. The Python implementation will be available on request to facilitate reproducibility.
AB - Although accurate head pose estimation is critical for natural human-computer interaction, it remains challenging due to occlusion, extreme poses, illumination conditions, and data ambiguity issues. To address these challenges, a novel morphology-aware Transformer framework (MHPE) is proposed, which can learn morphological relationships during facial rotation. The methodology is based on two key findings: cross-region geometric dependencies and angle-specific morphodynamic representations. The proposed framework incorporates two key components: adversarial feature generation, which generates robust rotation representations by adaptive multi-scale feature interaction; and morphology relationship inference, which establishes long-range dependencies between facial features through a cross-modal attention mechanism that incorporates morphological priors. Extensive evaluations on three demanding benchmarks (BIWI, AFLW2000, and 300W-LP) demonstrate state-of-the-art performance, particularly in demanding scenarios. The Python implementation will be available on request to facilitate reproducibility.
KW - Geometric dependencies
KW - Head pose estimation
KW - Morphology representation learning
KW - Morphology-aware
KW - Transformer
UR - https://www.scopus.com/pages/publications/105027693761
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-105027693761&origin=recordpage
U2 - 10.1109/TCSVT.2026.3651198
DO - 10.1109/TCSVT.2026.3651198
M3 - RGC 21 - Publication in refereed journal
SN - 1051-8215
JO - IEEE Transactions on Circuits and Systems for Video Technology
JF - IEEE Transactions on Circuits and Systems for Video Technology
ER -