Maximum-margin structured learning with deep networks for 3D human pose estimation

Research output: Chapters, Conference Papers, Creative and Literary Works; RGC 32 - Refereed conference paper (with host publication); peer-review

160 Scopus Citations

Detail(s)

Original language: English
Title of host publication: Proceedings of the IEEE International Conference on Computer Vision
Publisher: Institute of Electrical and Electronics Engineers, Inc.
Pages: 2848-2856
Volume: 11-18-December-2015
ISBN (print): 9781467383912
Publication status: Published - Dec 2015

Publication series

Name
Volume: 11-18-December-2015
ISSN (Print): 1550-5499

Conference

Title: 15th IEEE International Conference on Computer Vision (ICCV 2015)
Place: Chile
City: Santiago
Period: 11 - 18 December 2015

Abstract

This paper focuses on structured-output learning using deep neural networks for 3D human pose estimation from monocular images. Our network takes an image and 3D pose as inputs and outputs a score value, which is high when the image-pose pair matches and low otherwise. The network structure consists of a convolutional neural network for image feature extraction, followed by two sub-networks for transforming the image features and pose into a joint embedding. The score function is then the dot-product between the image and pose embeddings. The image-pose embedding and score function are jointly trained using a maximum-margin cost function. Our proposed framework can be interpreted as a special form of structured support vector machines where the joint feature space is discriminatively learned using deep neural networks. We test our framework on the Human3.6m dataset and obtain state-of-the-art results compared to other recent methods. Finally, we present visualizations of the image-pose embedding space, demonstrating the network has learned a high-level embedding of body-orientation and pose-configuration.
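The scoring and training idea described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. Each sub-network is replaced by a single hypothetical linear map (`w_image`, `w_pose`), the image features are assumed to be already extracted by the convolutional network, and the loss uses a margin-rescaled hinge with mean per-joint distance as the pose error. The abstract only says "maximum-margin cost function", so that exact form is an assumption, as are all function and parameter names.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear(weights, x):
    """Apply a weight matrix (list of rows) to a vector.

    Hypothetical stand-in for one embedding sub-network.
    """
    return [dot(row, x) for row in weights]


def score(image_features, pose, w_image, w_pose):
    """Score an image-pose pair as the dot product of their embeddings."""
    image_embedding = linear(w_image, image_features)
    pose_embedding = linear(w_pose, pose)
    return dot(image_embedding, pose_embedding)


def pose_distance(pose_a, pose_b):
    """Mean per-joint Euclidean distance (assumed error measure).

    Poses are flat lists [x1, y1, z1, x2, y2, z2, ...].
    """
    n_joints = len(pose_a) // 3
    total = 0.0
    for j in range(n_joints):
        a = pose_a[3 * j:3 * j + 3]
        b = pose_b[3 * j:3 * j + 3]
        total += math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))
    return total / n_joints


def max_margin_loss(image_features, true_pose, candidate_poses, w_image, w_pose):
    """Margin-rescaled hinge loss over a set of candidate poses (assumed form).

    The true pose should outscore every candidate by at least the
    candidate's distance from the true pose.
    """
    true_score = score(image_features, true_pose, w_image, w_pose)
    most_violating = max(
        score(image_features, c, w_image, w_pose) + pose_distance(true_pose, c)
        for c in candidate_poses
    )
    return max(0.0, most_violating - true_score)
```

Under these assumptions, prediction at test time would amount to picking the candidate pose with the highest `score` for a given image. Training would adjust the embedding weights to lower `max_margin_loss`.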

Citation Format(s)

Maximum-margin structured learning with deep networks for 3D human pose estimation. / Li, Sijin; Zhang, Weichen; Chan, Antoni B.
Proceedings of the IEEE International Conference on Computer Vision. Vol. 11-18-December-2015 Institute of Electrical and Electronics Engineers, Inc., 2015. p. 2848-2856 7410683.
