Maximum-margin structured learning with deep networks for 3D human pose estimation
Research output: Chapters, Conference Papers, Creative and Literary Works › RGC 32 - Refereed conference paper (with host publication) › peer-review
Author(s)
Li, Sijin; Zhang, Weichen; Chan, Antoni B.
Detail(s)
Original language | English |
---|---|
Title of host publication | Proceedings of the IEEE International Conference on Computer Vision |
Publisher | Institute of Electrical and Electronics Engineers, Inc. |
Pages | 2848-2856 |
Volume | 11-18-December-2015 |
ISBN (print) | 9781467383912 |
Publication status | Published - Dec 2015 |
Publication series
Name | |
---|---|
Volume | 11-18-December-2015 |
ISSN (Print) | 1550-5499 |
Conference
Title | 15th IEEE International Conference on Computer Vision (ICCV 2015) |
---|---|
Place | Chile |
City | Santiago |
Period | 11 - 18 December 2015 |
Abstract
This paper focuses on structured-output learning using deep neural networks for 3D human pose estimation from monocular images. Our network takes an image and a 3D pose as inputs and outputs a score value, which is high when the image-pose pair matches and low otherwise. The network structure consists of a convolutional neural network for image feature extraction, followed by two sub-networks that transform the image features and the pose into a joint embedding. The score function is then the dot product between the image and pose embeddings. The image-pose embedding and score function are jointly trained using a maximum-margin cost function. Our proposed framework can be interpreted as a special form of structured support vector machines where the joint feature space is discriminatively learned using deep neural networks. We test our framework on the Human3.6M dataset and obtain state-of-the-art results compared to other recent methods. Finally, we present visualizations of the image-pose embedding space, demonstrating that the network has learned a high-level embedding of body orientation and pose configuration.
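The scoring and training idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the one-layer `embed` sub-networks, the mean per-joint distance used as the task loss in `pose_error`, the margin-rescaled hinge form of `max_margin_loss`, and the finite candidate set are all assumptions made for illustration, and every function name is hypothetical. The abstract itself states only that the score is a dot product of two learned embeddings and that training uses a maximum-margin cost.

```python
import math


def linear(weights, vec):
    """Apply a weight matrix (given as a list of rows) to a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def embed(weights, vec):
    """Hypothetical one-layer sub-network: linear map followed by ReLU."""
    return [max(0.0, v) for v in linear(weights, vec)]


def score(img_weights, pose_weights, image_features, pose):
    """Score an image-pose pair as the dot product of the two embeddings."""
    f_img = embed(img_weights, image_features)
    f_pose = embed(pose_weights, pose)
    return sum(a * b for a, b in zip(f_img, f_pose))


def pose_error(pose_a, pose_b, dims=3):
    """Mean Euclidean distance between corresponding joints.

    Assumed task loss; poses are flat lists of joint coordinates.
    """
    n_joints = len(pose_a) // dims
    total = 0.0
    for j in range(n_joints):
        a = pose_a[j * dims:(j + 1) * dims]
        b = pose_b[j * dims:(j + 1) * dims]
        total += math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return total / n_joints


def max_margin_loss(img_weights, pose_weights, image_features,
                    true_pose, candidate_poses):
    """Margin-rescaled structured hinge loss over a finite candidate set.

    The true pose should outscore each candidate by at least that
    candidate's pose error; the loss is the largest violation (or zero).
    """
    s_true = score(img_weights, pose_weights, image_features, true_pose)
    worst = max(
        score(img_weights, pose_weights, image_features, y)
        + pose_error(true_pose, y)
        for y in candidate_poses
    )
    return max(0.0, worst - s_true)


# Toy usage: 2-D image features, a single 3-D joint, 2-D joint embedding.
img_w = [[1.0, 0.0], [0.0, 1.0]]
pose_w = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
features = [1.0, 2.0]
true_pose = [3.0, -1.0, 5.0]
candidates = [[0.0, 0.0, 0.0], [3.0, -1.0, 4.0]]
loss = max_margin_loss(img_w, pose_w, features, true_pose, candidates)
```

In this form the loss is zero only when the true pose beats every candidate by a margin equal to that candidate's pose error, which is how a structured SVM ties the required score gap to how wrong an alternative output is. In a full system the weights of both sub-networks and the image feature extractor would be updated by gradient descent on this loss.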
Citation Format(s)
Maximum-margin structured learning with deep networks for 3D human pose estimation. / Li, Sijin; Zhang, Weichen; Chan, Antoni B.
Proceedings of the IEEE International Conference on Computer Vision. Vol. 11-18-December-2015. Institute of Electrical and Electronics Engineers, Inc., 2015. p. 2848-2856 7410683.