A Semi-Supervised Approach to Monocular Depth Estimation, Depth Refinement, and Semantic Segmentation of Driving Scenes using a Siamese Triple Decoder Architecture
Author(s) - John Paul T. Yusiong, Prospero C. Naval
Publication year - 2020
Publication title - Informatica
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.172
H-Index - 34
eISSN - 1854-3871
pISSN - 0350-5596
DOI - 10.31449/inf.v44i4.3018
Subject(s) - computer science, segmentation, artificial intelligence, ground truth, monocular, task (project management), context (archaeology), depth map, computer vision, semantics (computer science), image (mathematics), pattern recognition (psychology), paleontology, management, economics, biology, programming language
Depth estimation and semantic segmentation are two fundamental tasks in scene understanding. They are usually solved separately, although they are highly correlated and have complementary properties. Solving them jointly benefits real-world applications that require both geometric and semantic information. Within this context, the paper presents a unified learning framework that generates a refined depth map and a semantic segmentation map from a single image. Specifically, the paper proposes a novel architecture called JDSNet, a Siamese triple decoder architecture that simultaneously performs depth estimation, depth refinement, and semantic labeling of a scene from an image by exploiting the interaction between depth and semantic information. JDSNet is trained with a semi-supervised method to learn features for both tasks: the depth estimation task relies on geometry-based image reconstruction instead of ground-truth depth labels, while the semantic segmentation task requires ground-truth semantic labels. The proposed approach is evaluated on the KITTI driving dataset. The experimental results show that it achieves excellent performance on both tasks, which indicates that the model can effectively use both geometric and semantic information.
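The semi-supervised objective described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a rectified stereo pair as the source of the image reconstruction signal (the abstract does not specify this), works on a single grayscale scanline, and uses hypothetical names (`warp_row`, `photometric_l1`, `cross_entropy`, `semi_supervised_loss`, `seg_weight`). The depth term needs no depth labels, only a second view; the segmentation term needs per-pixel class labels.

```python
import math

def warp_row(right_row, disparity_row):
    """Reconstruct a left scanline by sampling the right scanline at x - d.

    Assumes rectified stereo, so a left pixel at x matches the right pixel
    at x - d. Sampling positions are clamped to the row and linearly
    interpolated.
    """
    width = len(right_row)
    out = []
    for x, d in enumerate(disparity_row):
        src = min(max(x - d, 0.0), width - 1.0)
        x0 = int(math.floor(src))
        x1 = min(x0 + 1, width - 1)
        w = src - x0
        out.append((1.0 - w) * right_row[x0] + w * right_row[x1])
    return out

def photometric_l1(left_row, reconstructed_row):
    """Mean absolute difference between the real and reconstructed scanline."""
    return sum(abs(a - b) for a, b in zip(left_row, reconstructed_row)) / len(left_row)

def cross_entropy(class_probs, labels):
    """Mean negative log-likelihood of the ground-truth class at each pixel."""
    eps = 1e-12  # guards against log(0)
    return -sum(math.log(p[y] + eps) for p, y in zip(class_probs, labels)) / len(labels)

def semi_supervised_loss(left_row, right_row, disparity_row,
                         class_probs, labels, seg_weight=1.0):
    """Label-free reconstruction term plus supervised segmentation term.

    seg_weight is a hypothetical balancing factor, not a value from the paper.
    """
    reconstructed = warp_row(right_row, disparity_row)
    depth_term = photometric_l1(left_row, reconstructed)
    seg_term = cross_entropy(class_probs, labels)
    return depth_term + seg_weight * seg_term
```

In this sketch, a disparity estimate that warps the right scanline exactly onto the left one drives the reconstruction term to zero, which is how the depth branch can receive a training signal without ground-truth depth. A real system would operate on full images, typically with a differentiable sampler and additional terms such as smoothness regularization.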
