ECSNet: Spatio-Temporal Feature Learning for Event Camera

Research output: Journal Publications and Reviews > RGC 21 - Publication in refereed journal > peer-review

11 Scopus Citations

Author(s)

  • Zhiwen Chen
  • Jinjian Wu
  • Junhui Hou
  • Leida Li
  • Weisheng Dong
  • Guangming Shi

Detail(s)

Original language: English
Pages (from-to): 701-712
Number of pages: 12
Journal / Publication: IEEE Transactions on Circuits and Systems for Video Technology
Volume: 33
Issue number: 2
Online published: 29 Aug 2022
Publication status: Published - Feb 2023

Abstract

Neuromorphic event cameras can efficiently sense the latent geometric structures and motion cues of a scene by generating asynchronous and sparse event signals. Due to the irregular layout of the event signals, how to leverage their rich spatio-temporal information for recognition tasks remains a significant challenge. Existing methods tend to treat events as dense image-like or point-series representations. However, they either severely destroy the sparsity of event data or fail to encode robust spatial cues. To fully exploit the inherent sparsity of events while reconciling their spatio-temporal information, we introduce a compact event representation, namely the 2D-1T event cloud sequence (2D-1T ECS). We couple this representation with a novel lightweight spatio-temporal learning framework (ECSNet) that accommodates both object classification and action recognition tasks. The core of our framework is a hierarchical spatial relation module. Equipped with a specially designed surface-event-based sampling unit and a local event normalization unit to enhance inter-event relation encoding, this module learns robust geometric features from the 2D event clouds. We also propose a motion attention module for efficiently capturing long-term temporal context evolving with the 1T cloud sequence. Empirically, the experiments show that our framework achieves performance on par with or even better than the state of the art. Importantly, our approach works well with the sparsity of event data without any sophisticated operations, leading to low computational cost and fast inference. © 2022 IEEE.
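The abstract does not spell out how a 2D-1T event cloud sequence is built, so the following is only an illustrative sketch of the general idea, not the authors' construction. It assumes events arrive as (x, y, timestamp, polarity) tuples with integer microsecond timestamps and that the stream is cut into equal-duration windows, each kept as a sparse 2D point cloud. The function name `to_event_cloud_sequence` and the parameter `num_slices` are hypothetical. The surface-event-based sampling and local event normalization units mentioned in the abstract are not modelled here.

```python
from typing import List, Tuple

Event = Tuple[int, int, int, int]   # (x, y, timestamp in microseconds, polarity)
Cloud = List[Tuple[int, int, int]]  # sparse 2D cloud of (x, y, polarity) points


def to_event_cloud_sequence(events: List[Event], num_slices: int) -> List[Cloud]:
    """Split an event stream into `num_slices` equal-duration 2D event clouds.

    Assumption: equal-duration temporal windows; the paper may slice differently.
    """
    if num_slices < 1:
        raise ValueError("num_slices must be at least 1")
    clouds: List[Cloud] = [[] for _ in range(num_slices)]
    if not events:
        return clouds
    t_start = min(e[2] for e in events)
    duration = max(e[2] for e in events) - t_start
    for x, y, t, p in events:
        if duration == 0:
            index = 0  # all events share one timestamp
        else:
            # Integer arithmetic avoids rounding issues; the latest event
            # is clamped into the final slice.
            index = min((t - t_start) * num_slices // duration, num_slices - 1)
        # Each event stays a single sparse point; no dense frame is allocated.
        clouds[index].append((x, y, p))
    return clouds


events = [(3, 4, 0, 1), (5, 1, 10_000, -1), (3, 5, 50_000, 1), (7, 7, 100_000, 1)]
sequence = to_event_cloud_sequence(events, num_slices=2)
```

In this toy input the first two events land in the first cloud and the last two in the second, so the spatial structure lives in each 2D cloud while the temporal structure lives in the order of the sequence.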

Research Area(s)

  • Event Camera, Spatio-temporal Feature Learning, Object Classification, Action Recognition

Bibliographic Note

Research Unit(s) information for this publication is provided by the author(s) concerned.

Citation Format(s)

ECSNet: Spatio-Temporal Feature Learning for Event Camera. / Chen, Zhiwen; Wu, Jinjian; Hou, Junhui et al.
In: IEEE Transactions on Circuits and Systems for Video Technology, Vol. 33, No. 2, 02.2023, p. 701-712.
