Multi-Scale correlation module for video-based facial expression recognition in the wild

Tankun Li, Kwok-Leung Chan*, Tardi Tjahjadi

*Corresponding author for this work

Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review

6 Citations (Scopus)

Abstract

The detection of facial muscle movements (e.g., mouth opening) is crucial for facial expression recognition (FER). However, extracting these facial motion features is challenging for a deep-learning recognition system for the following reasons: (1) without explicit motion labels for training, there is no guarantee that convolutional neural networks (CNNs) can extract motions effectively; (2) compared to the motions in human action recognition (e.g., an object moving from left to right), some facial motions (e.g., raising eyebrows) are more subtle and thus harder to extract; and (3) using optical flow to extract motion features is time-consuming with a commonly used camera. In this work, we propose a Multi-Scale Correlation Module (MSCM) together with an adaptive fusion module. First, large as well as small facial motions are extracted by the MSCM and encoded by CNNs. Then, the adaptive fusion module aggregates the motion features. With these modules, our recognition network is able to model both subtle and large motion features for video-based FER with only RGB image frames as input. Experiments on two datasets, AFEW and DFEW, show that the network achieves state-of-the-art performance on both benchmarks. © 2023 Elsevier Ltd. All rights reserved.
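The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of correlating frame features at several scales and fusing the results. It is not the authors' MSCM. The function names (`local_correlation`, `avg_pool`, `multi_scale_correlation`, `adaptive_fusion`), the choice of average pooling to build coarser scales, the displacement window, and the softmax-weighted fusion are all assumptions made for illustration. The intuition it shows is that the same small search window covers a larger motion once the feature maps are downsampled.

```python
import numpy as np

def local_correlation(f1, f2, max_disp):
    """Correlate every position of f1 with f2 at each offset in a
    (2*max_disp+1)^2 window. Inputs are (C, H, W); output is (D, H, W)."""
    c, h, w = f1.shape
    # Zero-pad f2 spatially so that shifted views stay in bounds.
    padded = np.pad(f2, ((0, 0), (max_disp, max_disp), (max_disp, max_disp)))
    maps = []
    for dy in range(2 * max_disp + 1):
        for dx in range(2 * max_disp + 1):
            shifted = padded[:, dy:dy + h, dx:dx + w]
            # Channel-wise dot product, normalised by the channel count.
            maps.append((f1 * shifted).sum(axis=0) / c)
    return np.stack(maps)

def avg_pool(x, k):
    """Non-overlapping k x k average pooling (assumes H and W divisible by k)."""
    c, h, w = x.shape
    return x.reshape(c, h // k, k, w // k, k).mean(axis=(2, 4))

def multi_scale_correlation(f1, f2, scales=(1, 2), max_disp=1):
    """Hypothetical multi-scale variant: one correlation volume per scale,
    each upsampled back to the input resolution by nearest-neighbour repeat."""
    volumes = []
    for s in scales:
        corr = local_correlation(avg_pool(f1, s), avg_pool(f2, s), max_disp)
        volumes.append(corr.repeat(s, axis=1).repeat(s, axis=2))
    return volumes

def adaptive_fusion(volumes, logits):
    """Assumed fusion rule: softmax over per-scale scores, then a weighted sum.
    In a trained network the scores would be learned; here they are inputs."""
    logits = np.asarray(logits, dtype=float)
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    return sum(w * v for w, v in zip(weights, volumes))

# Toy usage with random "features" standing in for CNN outputs of two frames.
rng = np.random.default_rng(0)
frame_a = rng.standard_normal((4, 8, 8))
frame_b = rng.standard_normal((4, 8, 8))
volumes = multi_scale_correlation(frame_a, frame_b)
fused = adaptive_fusion(volumes, logits=[0.0, 0.0])
```

In this sketch, fine-scale correlations respond to one-pixel displacements while the coarser scale responds to displacements of roughly two pixels, which loosely mirrors the subtle-versus-large motion distinction drawn in the abstract.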

Original language: English
Article number: 109691
Journal: Pattern Recognition
Volume: 142
Online published: 13 May 2023
Publication status: Published - Oct 2023

Funding

The work described in this paper was fully supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project No. CityU 11202319).

Research Keywords

  • Adaptive fusion
  • Convolutional neural networks
  • Facial expression recognition
  • Motion estimation
