Invited Session

 
10:30 to 12:30 - June 20 (Friday)

 


 

Challenges in wide-area structure-from-motion

Marc Pollefeys (ETH-Zurich, Switzerland and UNC-Chapel Hill, USA)

In this talk, I will present work on wide-area structure-from-motion (SfM), such as city-wide 3D reconstructions from millions of video frames. While a lot of progress has been made in recent years in SfM and multi-view stereo reconstruction, wide-area 3D reconstruction and mapping lead to interesting new research challenges. For example, it is important to use efficient algorithms, as typically millions of frames have to be processed. In this context, we will present algorithms that exploit the tremendous computational power of recent graphics processing units (GPUs) to achieve real-time performance. Another challenge is avoiding unbounded accumulation of errors. For this, it is important to close loops when the camera path crosses itself (e.g. at street intersections). We introduce viewpoint-invariant patches (VIPs) to enable robust and efficient matching over widely varying viewpoints (e.g. orthogonal crossings). Our approach is illustrated with 3D reconstructions of Chapel Hill (where the last edition of 3DPVT was held).
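As a rough illustration of why loop closure bounds error accumulation, the toy sketch below integrates per-step motion estimates around a square block and then uses the knowledge that the path returns to its start to remove the drift. It is not the method presented in the talk: the constant per-step bias (`BIAS`), the linear correction in `close_loop`, and all names are assumptions made for the example.

```python
from math import hypot

def integrate(steps):
    """Dead-reckon a 2D path by summing relative motion estimates."""
    x, y = 0.0, 0.0
    path = [(x, y)]
    for dx, dy in steps:
        x, y = x + dx, y + dy
        path.append((x, y))
    return path

def close_loop(path):
    """Spread the start/end gap linearly along a path known to be a closed loop."""
    n = len(path) - 1
    gap_x = path[-1][0] - path[0][0]
    gap_y = path[-1][1] - path[0][1]
    return [(x - gap_x * i / n, y - gap_y * i / n)
            for i, (x, y) in enumerate(path)]

# Ground truth: drive around a square block, 100 unit steps per side.
true_steps = ([(1.0, 0.0)] * 100 + [(0.0, 1.0)] * 100 +
              [(-1.0, 0.0)] * 100 + [(0.0, -1.0)] * 100)

# Hypothetical error model: every motion estimate carries the same small bias.
BIAS = (0.01, -0.005)
measured_steps = [(dx + BIAS[0], dy + BIAS[1]) for dx, dy in true_steps]

truth = integrate(true_steps)
drifted = integrate(measured_steps)
corrected = close_loop(drifted)

def max_error(path):
    """Largest distance between an estimated position and the true one."""
    return max(hypot(px - tx, py - ty)
               for (px, py), (tx, ty) in zip(path, truth))

drift_at_end = max_error(drifted)            # grows with the number of steps
residual_after_closure = max_error(corrected)

print(f"max error without loop closure: {drift_at_end:.3f}")
print(f"max error with loop closure:    {residual_after_closure:.3f}")
```

In this toy the correction is essentially exact only because the bias is constant. Real drift is not, which is why practical systems typically feed loop-closure constraints into a global optimization such as bundle adjustment.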

 

 

 



Photosynth and Beyond

Drew Steedly (Microsoft Live Labs, USA)

Traditional photo browsing tools allow you to browse through 2D pages of thumbnails. Photosynth lets you browse photo collections in 3D space. This makes it easy to answer questions like "What is to the right of this photo?" and "Is there a more detailed photo of this part of the scene?". The Photosynth viewing experience relies on first automatically reconstructing the camera positions and a sparse point cloud. In this talk, I will discuss some recent enhancements to the viewing experience as well as a tool that allows users to interactively build textured 3D models from Photosynths.

 

 

 



Taking Google Maps to Street Level

Jana Kosecka (GMU and Google, USA)

I will describe Google's Street View feature, from the conception of the idea to initial capture experiments, challenges encountered along the way, and present and future directions. I will talk about what it takes to increase the quantity of coverage while maintaining and improving the quality of one of the few applications where the number of images can be measured in miles.

 

 

 



Fast, Automated, 3D, Airborne Modeling of Large Scale Urban Environments

Avideh Zakhor (UC Berkeley, USA)

3D modeling of large-scale environments is important in many applications such as city planning, training and simulations, architectural studies, gaming and entertainment, and emergency services. In this talk, we describe an approach to fast, automated 3D modeling of large-scale environments using airborne data only. In contrast with ground-based modeling, which entails driving on every street of a city, airborne data acquisition can be significantly faster, and hence can scale to much larger areas. Our basic approach is to construct the 3D geometry using airborne LiDAR data obtained by an airplane, and to texture map this model using aerial imagery from a helicopter equipped with inexpensive inertial measurement units (IMUs). At the core of our approach lies an automated algorithm for texture mapping oblique aerial images onto a 3D model generated from airborne LiDAR data. Our proposed texture mapping algorithm consists of two steps. In the first step, we combine vanishing points and GPS-aided inertial system readings to roughly estimate the extrinsic parameters of a calibrated camera. In the second step, we refine this coarse estimate through a series of processing steps. Specifically, we extract 2D orthogonal corners (2DOCs), corresponding to orthogonal 3D structural corners, as features from both the images and the untextured 3D LiDAR model. The correspondence between an image and the 3D model is then established using a Hough transform and generalized M-estimator sample consensus. The resulting 2DOC matches are used in Lowe's algorithm to refine the camera parameters obtained earlier. Our system achieves a 91% correct pose recovery rate for 90 images over the downtown Berkeley area, and an overall 61% accuracy rate for 358 images over the residential, downtown and campus portions of the city of Berkeley.
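To give a feel for the robust-estimation idea behind M-estimator sample consensus, here is a minimal sketch of plain MSAC applied to a toy 2D line fit. It is not the speakers' 2DOC matching pipeline and not the generalized variant named in the abstract; the data, the threshold, the iteration count and all function names are assumptions chosen for illustration. The key point is the scoring: each residual contributes its squared value, capped at the squared threshold, so gross outliers cannot dominate the cost.

```python
import random

def fit_line(p, q):
    """Line through two points as (a, b, c) with a*x + b*y + c = 0, a^2 + b^2 = 1."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = (a * a + b * b) ** 0.5
    if norm == 0.0:
        return None  # coincident points define no line
    a, b = a / norm, b / norm
    return a, b, -(a * x1 + b * y1)

def msac_cost(line, points, threshold):
    """Sum of squared point-to-line distances, each capped at threshold^2."""
    a, b, c = line
    cap = threshold * threshold
    return sum(min((a * x + b * y + c) ** 2, cap) for x, y in points)

def msac_line(points, threshold=0.5, iterations=200, seed=0):
    """Keep the two-point hypothesis with the lowest truncated cost."""
    rng = random.Random(seed)
    best_line, best_cost = None, float("inf")
    for _ in range(iterations):
        p, q = rng.sample(points, 2)
        line = fit_line(p, q)
        if line is None:
            continue
        cost = msac_cost(line, points, threshold)
        if cost < best_cost:
            best_line, best_cost = line, cost
    return best_line, best_cost

# Toy data (hypothetical): 30 points on y = 2x + 1 plus 10 gross outliers.
inliers = [(float(x), 2.0 * x + 1.0) for x in range(30)]
outliers = [(float(x), 100.0 + 7.0 * (x % 3)) for x in range(10)]
points = inliers + outliers

line, cost = msac_line(points)
a, b, c = line
inlier_count = sum(abs(a * x + b * y + c) <= 0.5 for x, y in points)
print(f"estimated slope: {-a / b:.3f}, points within threshold: {inlier_count}")
```

Compared with counting inliers as in basic RANSAC, the truncated cost also rewards hypotheses that fit their inliers more tightly, at no extra computational cost.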

 

 

 



 