View-Adaptive Graph Neural Network for Action Recognition

Research output: Journal Publications and Reviews (RGC 21 - Publication in refereed journal), peer-reviewed

8 Scopus Citations

Detail(s)

Original language: English
Pages (from-to): 969-978
Journal / Publication: IEEE Transactions on Cognitive and Developmental Systems
Volume: 15
Issue number: 2
Online published: 6 Sept 2022
Publication status: Published - Jun 2023

Abstract

Skeleton-based recognition of human actions has received attention in recent years because of the popularity of 3D acquisition sensors. Existing studies use 3D skeleton data from video clips collected from several views. When humans perform certain actions, the body orientation shifts relative to the camera, resulting in unstable and noisy skeletal data. In this paper, we develop a view-adaptive (VA) mechanism that identifies the viewpoints across a sequence and transforms the skeleton view through a data-driven learning process to counteract the influence of view variations. Most existing methods reposition skeletons according to a fixed, human-defined prior criterion. Instead, we adopt an unsupervised repositioning approach and jointly design a VA neural network based on the graph neural network (GNN). Our VA-GNN model transforms skeletons from distinct views into a considerably more consistent virtual view, outperforming fixed preprocessing approaches. The VA module learns the best observation view: it determines the most suitable viewpoint and transforms the skeletons of the action sequence for end-to-end recognition, while the adaptive GNN learns a suited graph topology. This strategy reduces the influence of view variance, allowing the network to focus on learning action-specific features and leading to improved performance. Experiments on four benchmark datasets demonstrate the accuracy of the proposed approach. © 2022 IEEE.
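The view-adaptation idea described in the abstract can be illustrated with a minimal PyTorch sketch: a small regressor predicts a per-sequence rotation and translation from the raw joints, and the transformed skeletons are then passed to a recognition backbone. This is a hypothetical illustration under our own assumptions (the class name ViewAdaptiveModule, the per-sequence rigid transform, and the hidden size are ours), not the authors' implementation.

```python
import torch
import torch.nn as nn


def rotation_matrix(angles):
    # angles: (N, 3) rotation angles about the x, y, z axes in radians
    cx, cy, cz = torch.cos(angles).unbind(-1)
    sx, sy, sz = torch.sin(angles).unbind(-1)
    zeros, ones = torch.zeros_like(cx), torch.ones_like(cx)
    Rx = torch.stack([ones, zeros, zeros,
                      zeros, cx, -sx,
                      zeros, sx, cx], dim=-1).view(-1, 3, 3)
    Ry = torch.stack([cy, zeros, sy,
                      zeros, ones, zeros,
                      -sy, zeros, cy], dim=-1).view(-1, 3, 3)
    Rz = torch.stack([cz, -sz, zeros,
                      sz, cz, zeros,
                      zeros, zeros, ones], dim=-1).view(-1, 3, 3)
    return Rz @ Ry @ Rx


class ViewAdaptiveModule(nn.Module):
    """Regresses a rotation and translation from the skeleton sequence and
    applies them, so downstream layers see a more view-consistent virtual
    observation of the action."""

    def __init__(self, num_joints, in_channels=3, hidden=64):
        super().__init__()
        self.regressor = nn.Sequential(
            nn.Flatten(start_dim=2),                      # (N, T, V*C)
            nn.Linear(num_joints * in_channels, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 6),                         # 3 angles + 3 translation
        )

    def forward(self, x):
        # x: (N, T, V, 3) batch of skeleton sequences
        N = x.shape[0]
        params = self.regressor(x).mean(dim=1)            # pool over time -> (N, 6)
        R = rotation_matrix(params[:, :3])                # (N, 3, 3)
        t = params[:, 3:].view(N, 1, 1, 3)
        # rotate every joint of every frame, then translate
        return torch.einsum('nij,ntvj->ntvi', R, x) + t


# Example: 8 sequences, 64 frames, 25 joints (NTU-style skeletons)
va = ViewAdaptiveModule(num_joints=25)
x = torch.randn(8, 64, 25, 3)
x_aligned = va(x)   # same shape, viewed from a learned virtual perspective
```

In the paper, the aligned skeletons would then be fed to a graph neural network whose topology is itself adapted during training; that backbone is omitted here, and this sketch only conveys how a learned, data-driven transform can replace a fixed, human-defined repositioning step.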

Research Area(s)

  • 3-D skeleton, Action recognition, Bones, Cameras, graph convolution neural network, Joints, Spatiotemporal phenomena, Three-dimensional displays, Transforms, Urban areas, View-adaptive (VA)