On the Fusion of RGB and Depth Information for Hand Pose Estimation

Evangelos Kazakos, Christophoros Nikou, Ioannis A. Kakadiaris

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

10 Scopus citations

Abstract

Recent advances in deep learning have spurred progress in 3D hand pose estimation, as convolutional network (ConvNet) based methods have outperformed random forests. However, state-of-the-art ConvNet based methods employ only depth images of the hand, without leveraging color and texture information from the RGB domain. In this paper, we investigate whether ConvNets can learn richer and more discriminative embeddings by combining RGB and depth information. To answer this question, we propose the fusion of RGB and depth information in a double-stream architecture. More specifically, RGB and depth images are fed into two separate networks that extract features, which are subsequently fused at an intermediate layer of the ConvNet; we implement input-level, feature-level, and score-level fusion. The double-stream scheme is coupled with a deep ConvNet, in contrast to the shallow networks that are mostly proposed in the literature. Experimental results show that while the depth of the network is crucial for hand pose estimation, the double-stream nets perform very similarly to the net trained only on depth images. This may suggest that training double-stream architectures purely with supervision is insufficient for hand pose estimation with RGB-D fusion.
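
The three fusion points named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how the streams are combined, so concatenation (for input-level and feature-level fusion) and averaging (for score-level fusion) are assumptions, as are the joint count, the image and feature sizes, and the hypothetical `dense` helper, which stands in for a ConvNet stage.

```python
import random

NUM_JOINTS = 14             # assumption: illustrative joint count, not from the paper
POSE_DIM = 3 * NUM_JOINTS   # one (x, y, z) triple per joint
RGB_DIM = 3 * 4 * 4         # toy 4x4 RGB image, flattened
DEPTH_DIM = 1 * 4 * 4       # toy 4x4 depth map, flattened
FEAT_DIM = 8                # assumption: toy feature size


def dense(in_dim, out_dim, rng, relu=True):
    """Toy dense layer with fixed random weights; a stand-in for a ConvNet stage."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
               for _ in range(out_dim)]

    def forward(x):
        assert len(x) == in_dim
        out = [sum(w * v for w, v in zip(row, x)) for row in weights]
        return [max(0.0, o) for o in out] if relu else out

    return forward


def input_level_fusion(rgb, depth, rng):
    """Concatenate RGB and depth first; a single stream sees both modalities."""
    stream = dense(RGB_DIM + DEPTH_DIM, FEAT_DIM, rng)
    head = dense(FEAT_DIM, POSE_DIM, rng, relu=False)
    return head(stream(rgb + depth))


def feature_level_fusion(rgb, depth, rng):
    """Extract features per modality, concatenate them, then regress the pose."""
    rgb_stream = dense(RGB_DIM, FEAT_DIM, rng)
    depth_stream = dense(DEPTH_DIM, FEAT_DIM, rng)
    head = dense(2 * FEAT_DIM, POSE_DIM, rng, relu=False)
    return head(rgb_stream(rgb) + depth_stream(depth))


def score_level_fusion(rgb, depth, rng):
    """Each stream regresses its own pose; the two predictions are averaged."""
    rgb_stream = dense(RGB_DIM, FEAT_DIM, rng)
    depth_stream = dense(DEPTH_DIM, FEAT_DIM, rng)
    rgb_head = dense(FEAT_DIM, POSE_DIM, rng, relu=False)
    depth_head = dense(FEAT_DIM, POSE_DIM, rng, relu=False)
    pose_rgb = rgb_head(rgb_stream(rgb))
    pose_depth = depth_head(depth_stream(depth))
    return [(a + b) / 2 for a, b in zip(pose_rgb, pose_depth)]


rng = random.Random(0)
rgb = [rng.random() for _ in range(RGB_DIM)]
depth = [rng.random() for _ in range(DEPTH_DIM)]
for fuse in (input_level_fusion, feature_level_fusion, score_level_fusion):
    pose = fuse(rgb, depth, rng)
    print(fuse.__name__, len(pose))
```

All three variants return a pose vector of the same length; they differ only in where the RGB and depth information meet, which is the design axis the abstract compares.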

Original language: English (US)
Title of host publication: 2018 IEEE International Conference on Image Processing, ICIP 2018 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 868-872
Number of pages: 5
ISBN (Electronic): 9781479970612
State: Published - Aug 29 2018
Event: 25th IEEE International Conference on Image Processing, ICIP 2018 - Athens, Greece
Duration: Oct 7 2018 to Oct 10 2018

Publication series

Name: Proceedings - International Conference on Image Processing, ICIP
ISSN (Print): 1522-4880

Conference

Conference: 25th IEEE International Conference on Image Processing, ICIP 2018
Country/Territory: Greece
City: Athens
Period: 10/7/18 to 10/10/18

Keywords

  • Deep learning
  • Double-stream networks
  • Fusion
  • Hand pose estimation
  • RGB-D

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition
  • Signal Processing
