3D Crowd Counting via Geometric Attention-Guided Multi-view Fusion
Research output: Journal Publications and Reviews › RGC 21 - Publication in refereed journal › peer-review
Detail(s)
| Field | Value |
|---|---|
| Original language | English |
| Pages (from-to) | 3123–3139 |
| Journal / Publication | International Journal of Computer Vision |
| Volume | 130 |
| Issue number | 12 |
| Online published | 29 Sept 2022 |
| Publication status | Published - Dec 2022 |
Abstract
Recently, multi-view crowd counting using deep neural networks has been proposed to enable counting in large and wide scenes using multiple cameras. Current methods project the camera-view features to the average-height plane of the 3D world, and then fuse the projected multi-view features to predict a 2D scene-level density map on the ground (i.e., bird's-eye view). Unlike previous research, we consider the variable height of people in the 3D world and propose to solve the multi-view crowd counting task through 3D feature fusion with 3D scene-level density maps, instead of a 2D density map on the ground plane. Compared to 2D fusion, 3D fusion extracts more information about people along the z-dimension (height), which helps to address the scale variations across multiple views. The 3D density maps still preserve the property of 2D density maps that the sum is the count, while also providing 3D information about the crowd density. Furthermore, instead of using the standard method of copying the features along the view ray in the 2D-to-3D projection, we propose an attention module based on a height estimation network, which forces each 2D pixel to be projected to one 3D voxel along the view ray. We also explore the projection consistency between the 3D prediction and the ground truth in the 2D views to further enhance the counting performance. The proposed method is tested on synthetic and real-world multi-view counting datasets and achieves counting performance better than or comparable to the state of the art.
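Two ideas from the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the Gaussian kernel, the softmax weighting, and the toy grid sizes are all assumptions made for illustration. The view ray is also simplified to the z-axis of a pixel-aligned volume, so no camera geometry is modelled. The sketch shows (1) a 3D density map whose sum equals the person count, and whose sum over height gives a ground-plane map with the same count, and (2) height weights that distribute each 2D pixel's feature along its ray instead of copying it to every voxel.

```python
import numpy as np


def gaussian_blob_3d(shape, center, sigma=1.0):
    """Return a 3D Gaussian on a (Z, H, W) voxel grid, normalised to sum to 1."""
    zz, yy, xx = np.meshgrid(
        np.arange(shape[0]), np.arange(shape[1]), np.arange(shape[2]),
        indexing="ij",
    )
    cz, cy, cx = center
    dist_sq = (zz - cz) ** 2 + (yy - cy) ** 2 + (xx - cx) ** 2
    blob = np.exp(-dist_sq / (2.0 * sigma ** 2))
    return blob / blob.sum()


def density_map_3d(shape, positions, sigma=1.0):
    """Sum one unit-mass blob per person, so the map sums to the count."""
    density = np.zeros(shape, dtype=np.float64)
    for center in positions:
        density += gaussian_blob_3d(shape, center, sigma)
    return density


def height_attention(height_logits):
    """Softmax over the height axis (axis 0) of (Z, H, W) logits.

    Assumption: the logits stand in for the output of a height estimation
    network. The weights along each ray sum to 1.
    """
    shifted = height_logits - height_logits.max(axis=0, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=0, keepdims=True)


def lift_features(feat_2d, attention):
    """Distribute an (H, W) feature map over Z height bins using the weights."""
    return attention * feat_2d[None, :, :]
```

With hypothetical positions such as `[(1, 4, 4), (2, 10, 12), (1, 8, 8)]` on a `(4, 16, 16)` grid, `density_map_3d(...).sum()` is 3 up to floating-point error, and summing the map over axis 0 yields a ground-plane map with the same total. As the softmax logits become more sharply peaked, each pixel's feature concentrates in a single voxel along its ray, which is the behaviour the abstract describes.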
Research Area(s)
- 2D-3D projection, 3D fusion, 3D projection, Crowd counting, geometric attention, height estimation
Citation Format(s)
3D Crowd Counting via Geometric Attention-Guided Multi-view Fusion. / Zhang, Qi; Chan, Antoni B.
In: International Journal of Computer Vision, Vol. 130, No. 12, 12.2022, p. 3123–3139.