A Two-Stream Recurrent Network for Skeleton-based Human Interaction Recognition

Research output: Chapters, Conference Papers, Creative and Literary Works (RGC: 12, 32, 41, 45); 32_Refereed conference paper (with ISBN/ISSN); peer-review

Author(s)

Men, Qianhui; Ho, Edmond S. L.; Shum, Hubert P. H.; Leung, Howard

Detail(s)

Original language: English
Title of host publication: Proceedings of ICPR 2020
Subtitle of host publication: 25th International Conference on Pattern Recognition
Publisher: Institute of Electrical and Electronics Engineers
Pages: 2771-2778
ISBN (Electronic): 978-1-7281-8808-9
ISBN (Print): 978-1-7281-8809-6
Publication status: Published - Jan 2021

Publication series

Name: Proceedings - International Conference on Pattern Recognition
ISSN (Print): 1051-4651

Conference

Title: 25th International Conference on Pattern Recognition (ICPR2020)
Location: Virtual
Place: Italy
City: Milan
Period: 10 - 15 January 2021

Abstract

This paper addresses the problem of recognizing human-human interactions from skeletal sequences. Existing methods are mainly designed to classify the actions of a single person. Many of them handle interactions by simply stacking the movement features of the two characters, neglecting the rich relationships between them. In this paper, we propose a novel two-stream recurrent neural network that adopts geometric features from both single actions and interactions to describe spatial correlations with different discriminative abilities. The first stream is built on pairwise joint distance (PJD) in a fully-connected mesh to categorize interactions with explicit distance patterns. To better distinguish similar interactions, the second stream combines PJD with spatial features from individual joint positions, using graph convolutions to detect implicit correlations among joints; the joint connections in the graph are adaptive, allowing flexible correlations. After spatial modeling, each stream is fed to a bi-directional LSTM to encode two-way temporal properties. To take advantage of the diverse discriminative power of the two streams, we propose a late fusion algorithm that combines their output predictions based on information entropy. Experimental results show that the proposed framework achieves state-of-the-art performance on 3D interaction datasets and comparable performance on 2D interaction datasets. Moreover, the late fusion results show that combining the two streams improves recognition accuracy over either single stream.
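As a rough illustration of two ideas named in the abstract, the sketch below computes pairwise joint distances for one frame and fuses two streams' class probabilities with entropy-based weights. It is not the authors' implementation: the abstract gives neither the exact PJD layout nor the fusion formula, so the function names (`pairwise_joint_distances`, `entropy_weighted_fusion`), the choice to pair every joint in the combined two-person joint set, and the "lower entropy gets more weight" rule are all assumptions made for illustration.

```python
import math
from itertools import combinations


def pairwise_joint_distances(joints_a, joints_b):
    """Distances between all joint pairs of two skeletons in one frame.

    joints_a, joints_b: lists of coordinate tuples, one per joint.
    Assumption: "fully-connected" is read as every unordered pair in
    the combined joint set of both people.
    """
    joints = list(joints_a) + list(joints_b)
    return [math.dist(p, q) for p, q in combinations(joints, 2)]


def entropy(probs):
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def entropy_weighted_fusion(probs_1, probs_2):
    """Fuse two streams' class probabilities (hypothetical rule).

    Each stream is weighted by how far its entropy falls below the
    maximum possible entropy, so the more confident stream counts more.
    """
    max_h = math.log(len(probs_1))
    w1 = max_h - entropy(probs_1)
    w2 = max_h - entropy(probs_2)
    total = w1 + w2
    if total <= 1e-12:
        # Both streams are (near) uniform: fall back to a plain average.
        w1 = w2 = 0.5
    else:
        w1, w2 = w1 / total, w2 / total
    return [w1 * a + w2 * b for a, b in zip(probs_1, probs_2)]


if __name__ == "__main__":
    person_a = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    person_b = [(3.0, 4.0, 0.0), (3.0, 5.0, 0.0)]
    print(pairwise_joint_distances(person_a, person_b))

    stream_1 = [0.9, 0.05, 0.05]   # confident stream
    stream_2 = [0.4, 0.35, 0.25]   # uncertain stream
    print(entropy_weighted_fusion(stream_1, stream_2))
```

In this sketch the fused prediction leans toward whichever stream is more certain for a given sample, which is one plausible way to read "combine their output predictions concerning information entropy". The graph-convolution and bi-directional LSTM stages are left out.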

Citation Format(s)

A Two-Stream Recurrent Network for Skeleton-based Human Interaction Recognition. / Men, Qianhui; Ho, Edmond S. L.; Shum, Hubert P. H.; Leung, Howard.

Proceedings of ICPR 2020: 25th International Conference on Pattern Recognition. Institute of Electrical and Electronics Engineers, 2021. p. 2771-2778 9412538 (Proceedings - International Conference on Pattern Recognition).