Workshop Schedule:
15:00 – 15:05   Welcome and introduction
15:05 – 15:45   Invited talk: Joint Scene Reconstruction and Recognition from Images, Silvio Savarese
15:45 – 16:25   Invited talk: Multiview 3D Reconstruction and Semantic Parsing, Jana Kosecka
16:25 – 17:00   Poster session + Coffee break (see Poster Session below)
17:00 – 17:40   Invited talk: Statistical pose averaging with varying and non-isotropic covariances, Roberto Tron and Kostas Daniilidis
17:40 – 17:55   Oral talk: ORB-SLAM: Tracking and Mapping Recognizable Features, Raul Mur Artal and Juan D. Tardos
17:55 – 18:10   Oral talk: Real-time Dense Stereoscopic Visual Odometry, Sammy Omari, Michael Burri, Michael Bloesch, Markus Achtelik, Pascal Gohl and Roland Siegwart
18:10 – 18:25   Oral talk: A Convex Formulation for Motion Estimation using Visual and Inertial Sensors, Mingyang Li and Anastasios Mourikis
18:25 – 18:30   Concluding remarks
Poster Session:
Pose averaging for registration of multiple heterogeneous views
Roberto Tron, Philip Osteen, Jason Owens and Kostas Daniilidis
3D Reconstruction of Superpixels and its Use in Monocular SLAM
Alejo Concha and Javier Civera
Towards measuring uncertainty in volumetric signed distance function representations for active SLAM
Henry Carrillo, Yasir Latif, José Neira and José Castellanos
MEVO: Multi-Environment Stereo Visual Odometry
Thomas Koletschka, Luis Puig and Kostas Daniilidis
Abstracts of Invited Talks:
Joint Scene Reconstruction and Recognition from Images, Silvio Savarese
When we look at an environment such as a coffee shop, we don't just recognize the objects in isolation, but rather perceive a rich scene: the 3D space, its objects, and all the relations among them. This allows us to effortlessly navigate through the environment, or to interact with and manipulate objects in the scene with amazing precision. A major line of work from my group in recent years has been to design intelligent visual models that understand the 3D world by integrating 2D and 3D cues, inspired by what humans do. In this talk I will introduce a novel paradigm whereby objects and 3D space are modeled jointly to achieve a coherent and rich interpretation of the environment. I will start by giving an overview of our research on detecting objects and determining their geometric properties such as 3D location, pose or shape. Then, I will demonstrate that these detection methods play a critical role in modeling the interplay between objects and space, which, in turn, enables simultaneous semantic reasoning and 3D scene reconstruction. I will conclude this talk by demonstrating that our novel paradigm for scene understanding is potentially transformative in application areas such as autonomous or assisted navigation, robotics, augmented reality, automatic 3D modeling of urban environments, and surveillance.

Multiview 3D Reconstruction and Semantic Parsing, Jana Kosecka
Future advancements in robotic navigation, mapping, object search and exploration rest to a large extent on robust, efficient and scalable 3D modeling, semantic understanding of the surrounding environment, and the interplay between the two. I will discuss these functionalities in the context of indoor and outdoor environments and describe approaches that exploit the constraints of these environments and of the robotic tasks, making both geometric and semantic modeling better conditioned, more robust and more efficient.
I will first discuss a multiview dense 3D reconstruction approach that exploits piecewise planarity and a restricted number of plane orientations. The problem is formulated on an image pre-segmented into superpixels and cast as a photo-consistency-guided optimal labeling problem in an MRF framework. The approach demonstrates superior performance in difficult scenarios containing many repetitive structures and regions with little or no texture. Given the coarse 3D structure of the environment, I will then describe an efficient approach for inferring semantic categories: ground, structure, furniture and props in indoor settings, or ground, sky, building, vegetation and objects outdoors, with further refinement into object subcategories. The proposed approach naturally lends itself to multi-view as well as on-line recursive belief updates. Extensive evaluation on publicly available benchmark datasets will be presented.
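The on-line recursive belief update mentioned in the abstract can be illustrated with a minimal naive-Bayes sketch (a toy example of the general technique, not the talk's actual method): each new view contributes per-class likelihoods from a local classifier, which multiply into a running per-superpixel posterior. The category set matches the indoor setting above; all likelihood values are invented for illustration.

```python
import numpy as np

LABELS = ["ground", "structure", "furniture", "props"]

def update_belief(belief, likelihood):
    """One recursive Bayes step: posterior is proportional to prior * likelihood."""
    posterior = belief * likelihood
    return posterior / posterior.sum()

# Uniform prior over the four indoor categories.
belief = np.full(len(LABELS), 0.25)

# Per-view classifier scores for one superpixel (illustrative values only).
views = [
    np.array([0.10, 0.60, 0.20, 0.10]),  # view 1 favors "structure"
    np.array([0.15, 0.55, 0.20, 0.10]),  # view 2 agrees
    np.array([0.25, 0.40, 0.20, 0.15]),  # view 3 is less certain
]

for lik in views:
    belief = update_belief(belief, lik)

print(LABELS[int(np.argmax(belief))])  # most likely category after 3 views
```

Because the update is associative, views can arrive in any order and be fused incrementally, which is what makes the multi-view extension natural.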
Statistical pose averaging with varying and non-isotropic covariances, Roberto Tron and Kostas Daniilidis
In the last few years there has been a growing interest in optimization methods for averaging pose measurements between a set of cameras or objects (obtained, for instance, using epipolar geometry or pose estimation). However, many existing approaches do not take into consideration the fact that the measurements might have different uncertainties, and that the noise might not be isotropically distributed. Here, we propose a Riemannian optimization framework which incorporates estimates of the covariance matrices. We first estimate relative rigid body transformations between pairs of references and their uncertainties or ambiguities. Then we set up a graphical model describing the joint probability of the rotations and translations, and we optimize the cost function associated with this graphical model. We show examples from heterogeneous sensor calibration and from fiducial-based localization.
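The core idea of weighting pose measurements by their (possibly non-isotropic) covariances can be sketched on a simplified problem: averaging several noisy measurements of a single rotation, each with its own information matrix, by a Gauss-Newton-style iteration on SO(3). This is a generic first-order sketch under a small-residual approximation, not the authors' algorithm (which averages relative poses over a graphical model); the noise levels and single-rotation setting are assumptions for illustration.

```python
import numpy as np
from scipy.spatial.transform import Rotation

def weighted_rotation_mean(measurements, infos, iters=100, step=1.0):
    """Minimize f(R) = 0.5 * sum_k Log(R_k^-1 R)^T Info_k Log(R_k^-1 R)
    by Gauss-Newton on SO(3), ignoring the Jacobian of Log (a standard
    small-residual approximation)."""
    R = measurements[0]
    for _ in range(iters):
        grad = np.zeros(3)
        W = np.zeros((3, 3))
        for Rk, Ik in zip(measurements, infos):
            w = (Rk.inv() * R).as_rotvec()   # residual in the tangent space
            grad += Ik @ w
            W += Ik
        delta = np.linalg.solve(W, grad)     # information-weighted step
        R = R * Rotation.from_rotvec(-step * delta)
        if np.linalg.norm(delta) < 1e-12:
            break
    return R

rng = np.random.default_rng(0)
R_true = Rotation.from_rotvec([0.3, -0.2, 0.5])

meas, infos = [], []
for _ in range(5):
    # Non-isotropic noise: much larger uncertainty about the z-axis.
    cov = np.diag([1e-4, 1e-4, 1e-2])
    noise = rng.multivariate_normal(np.zeros(3), cov)
    meas.append(R_true * Rotation.from_rotvec(noise))
    infos.append(np.linalg.inv(cov))

R_hat = weighted_rotation_mean(meas, infos)
err_deg = np.degrees((R_true.inv() * R_hat).magnitude())
print(f"angular error: {err_deg:.3f} deg")
```

Relative to an unweighted average, the information matrices downweight the directions in which each measurement is uncertain, which is precisely the effect of modeling varying, non-isotropic covariances.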