Abstract
Event-based object recognition has drawn increasing attention because event cameras offer low power consumption and high dynamic range. For this new modality, previous works based on customized low-level descriptors are vulnerable to noise and generalize poorly. Recent works instead design various deep neural networks to extract event features, but they either lack sufficient data to fully train the event-based model or, with their single-view network, fail to encode spatial and temporal cues simultaneously. In this work, we address these limitations by proposing a multi-view attention-aware network, in which an event stream is projected onto multi-view 2D maps so that well-trained 2D models can be used and complementary spatio-temporal information can be explored. In addition, an attention mechanism is used to strengthen the complementary information across the different streams for better joint inference. Comprehensive experiments show that our model substantially outperforms state-of-the-art methods and demonstrate the efficacy of our multi-view fusion framework for event data.
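The projection of an event stream onto multi-view 2D maps can be illustrated with a minimal sketch. It assumes events are rows of `(x, y, t, polarity)` and that each view is a simple event-count map over the x-y, x-t, and y-t planes. The function name `project_events`, the binning of timestamps into `t_bins`, and the count-based accumulation are illustrative assumptions, not the representation used in the paper. Polarity is ignored here, and the attention-based fusion is not shown.

```python
import numpy as np

def project_events(events, width, height, t_bins):
    """Accumulate events (x, y, t, p) into three 2D count maps.

    Hypothetical helper for illustration only; the paper's actual
    projection may differ.
    """
    x = events[:, 0].astype(int)
    y = events[:, 1].astype(int)
    t = events[:, 2]

    # Map timestamps to integer bins in [0, t_bins - 1].
    span = max(t.max() - t.min(), 1e-9)
    tb = np.minimum(((t - t.min()) / span * t_bins).astype(int), t_bins - 1)

    xy = np.zeros((height, width))   # spatial view
    xt = np.zeros((t_bins, width))   # horizontal position over time
    yt = np.zeros((t_bins, height))  # vertical position over time

    # np.add.at accumulates correctly when indices repeat.
    np.add.at(xy, (y, x), 1)
    np.add.at(xt, (tb, x), 1)
    np.add.at(yt, (tb, y), 1)
    return xy, xt, yt

# Three toy events on a 2x2 sensor, split into two time bins.
events = np.array([[0, 0, 0.0, 1], [1, 0, 0.5, -1], [1, 1, 1.0, 1]])
xy, xt, yt = project_events(events, width=2, height=2, t_bins=2)
```

Each resulting map is an ordinary 2D array, so under these assumptions it could be passed to a standard pretrained 2D network, which is the motivation the abstract gives for the multi-view projection.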
| Original language | English |
|---|---|
| Pages (from-to) | 8275-8284 |
| Number of pages | 10 |
| Journal | IEEE Transactions on Circuits and Systems for Video Technology |
| Volume | 32 |
| Issue number | 12 |
| Online published | 16 Apr 2021 |
| Publication status | Published - Dec 2022 |
Research Keywords
- Attention
- Cameras
- Data models
- Event Data
- Feature extraction
- Multi-view
- Object Categorization
- Power demand
- Streaming media
- Task analysis
- Three-dimensional displays
Fingerprint
Dive into the research topics of 'MVF-Net: A Multi-view Fusion Network for Event-based Object Classification'. Together they form a unique fingerprint.
Projects
- 1 Finished
- GRF: Gaze Tracking and its Integration with Human-Robot Cooperation
  LI, Y. F. (Principal Investigator / Project Coordinator) & CHEN, H. (Co-Investigator)
  1/01/21 → 24/06/25
  Project: Research