Seminar of the Iby and Aladar Fleischman Faculty of Engineering

EE Seminar: MonSter - Awakening the Mono in Stereo

Wednesday, February 19, 2020, 15:00

Room 011, Kitot Building

Speaker: Yotam Gil

M.Sc. student under the supervision of Prof. Raja Giryes

Abstract

A novel stereo imaging system is presented that achieves enhanced performance over traditional stereo cameras and is capable of self-calibration.

Stereo imaging is the most common passive method for producing reliable depth maps; however, it suffers from two cardinal issues: a limited depth range due to disparity resolution, and sensitivity to extrinsic calibration. In this thesis, we offer a framework to overcome these limitations with the help of previously ignored information from each of the stereo images. First, we show how a stereo depth map can be improved by equipping one of the stereo cameras with a phase-coded mask tuned for the range of depths in which stereo struggles; in this range we use the monocular depth map as another source of depth information. This leads to a depth map fusion method that improves the accuracy of the original stereo depth map by 10%. Second, we present a novel online self-calibration approach, which uses both the stereo and monocular depth maps to find the transformation required for calibration by enforcing consistency between the two maps. Extrinsic calibration is a crucial step for every stereo-based system; despite all the advancements in the field, most calibrations are still done by the same tedious method of a checkerboard target. Monocular depth estimation methods do not require extrinsic calibration but generally achieve inferior depth accuracy. The proposed method works in a closed loop and exploits the global context of the pre-trained networks, and thus avoids feature matching and outlier issues. In addition to presenting our method using an image-based monocular depth estimation method, which can be implemented in most systems without additional changes, we also show that a phase-coded aperture mask leads to even better and faster convergence. We demonstrate our method on road scenes from the KITTI vision benchmark and on real-world scenes captured with our prototype camera.
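As a rough illustration of the disparity-resolution limit mentioned in the abstract, the following Python sketch uses the standard rectified-stereo relation Z = f·B/d and its first-order error estimate. It is not the method presented in the talk. The function names, the focal length and baseline values, and the inverse-variance fusion rule are all assumptions chosen for illustration only; the abstract does not specify how the two depth maps are actually fused.

```python
def stereo_depth(disparity_px, focal_px, baseline_m):
    """Depth of a point in a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def stereo_depth_error(depth_m, focal_px, baseline_m, disparity_step_px=1.0):
    """First-order depth uncertainty for a given disparity step:
    dZ ~= Z**2 / (f * B) * dd, so the error grows quadratically with depth."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_step_px


def fuse_depth(stereo_m, mono_m, stereo_sigma_m, mono_sigma_m):
    """Hypothetical fusion rule (an assumption, not the talk's method):
    inverse-variance weighted average of two depth estimates."""
    w_stereo = 1.0 / stereo_sigma_m ** 2
    w_mono = 1.0 / mono_sigma_m ** 2
    return (w_stereo * stereo_m + w_mono * mono_m) / (w_stereo + w_mono)


if __name__ == "__main__":
    # Illustrative camera parameters (assumed, not taken from the talk).
    focal_px, baseline_m = 700.0, 0.5
    for depth_m in (5.0, 10.0, 50.0):
        err = stereo_depth_error(depth_m, focal_px, baseline_m)
        print(f"depth {depth_m:5.1f} m -> one-pixel disparity error ~ {err:.2f} m")
```

Because the stereo error grows with the square of the depth, a second depth cue with a different error profile, such as a monocular estimate, can be weighted more heavily in the ranges where stereo is least reliable.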