
Consistent-Resolution Network for 3D Hand Shape Estimation from a Single RGB Image
Author(s) -
Qi Wu,
Joya Chen,
Zhiming Yao,
Xu Zhou,
Jianguo Wang,
Shaonan Wang,
Xiaobo Yang
Publication year - 2020
Publication title -
Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1631/1/012014
Subject(s) - artificial intelligence, computer vision, leverage (statistics), computer science, rgb color model, image resolution, monocular, deconvolution, representation (politics), feature (linguistics), image (mathematics), pattern recognition (psychology), algorithm, linguistics, philosophy, politics, political science, law
We propose a novel method for 3D hand shape estimation from a single RGB image. Most existing methods use a deep network to extract a low-resolution representation from which 3D coordinates are estimated, which leads to a loss of spatial information. In contrast, we present a Consistent-Resolution Network (CRNet) that extracts a representation at the same resolution as the original image, thus preserving more spatial detail. Specifically, we adopt the recent high-resolution network (HRNet) to generate high-resolution feature maps of the original image. Then, we design a deconvolution module to restore these feature maps to the size of the original image. We can therefore use this feature directly to learn the precise 2D shape and the depth map, and convert them into 3D coordinates in camera space. Through extensive experiments on FreiHAND, a large real-world dataset, we show that the proposed method predicts precise and suitable 3D hand shapes from a monocular view.
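
The abstract does not spell out how the 2D shape and depth map are converted into camera-space coordinates. A minimal sketch of one common way to do this, assuming a standard pinhole camera model with known intrinsics, is given below. The function name `backproject` and the parameters `fx`, `fy`, `cx`, `cy` are hypothetical and are not taken from the paper; the numeric values in the usage example are illustrative only.

```python
from typing import List, Sequence, Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]


def backproject(
    points_2d: Sequence[Point2D],
    depths: Sequence[float],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point3D]:
    """Lift pixel coordinates (u, v) with per-point depth z into camera space.

    Assumes a pinhole camera (an assumption, not stated in the abstract):
        X = (u - cx) * z / fx
        Y = (v - cy) * z / fy
        Z = z
    where (fx, fy) are focal lengths in pixels and (cx, cy) is the
    principal point in pixels.
    """
    if len(points_2d) != len(depths):
        raise ValueError("points_2d and depths must have the same length")

    points_3d: List[Point3D] = []
    for (u, v), z in zip(points_2d, depths):
        x = (u - cx) * z / fx
        y = (v - cy) * z / fy
        points_3d.append((x, y, z))
    return points_3d


if __name__ == "__main__":
    # Illustrative values only: two predicted 2D points and their depths.
    pts = backproject(
        points_2d=[(112.0, 112.0), (212.0, 62.0)],
        depths=[0.5, 0.5],
        fx=500.0, fy=500.0, cx=112.0, cy=112.0,
    )
    for p in pts:
        print(p)
```

A point at the principal point maps onto the optical axis, and offsets in pixels scale linearly with depth. This is why a 2D shape predicted at full image resolution, paired with a per-point depth, is sufficient to recover 3D coordinates under this model.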