Large-scale (Deep) Learning for GeoData Analysis

Monday, 23 October 2017 - 15:00
Dr. Jan Dirk Wegner, ETH Zürich
Jan Dirk Wegner received the Dipl.-Ing. (Univ.) degree in Geodesy and Geoinformatics in 2007 and the Dr.-Ing. degree in 2011, both from Leibniz Universität Hannover. In his doctoral thesis, which he passed with distinction, he worked on probabilistic models for combining optical and synthetic aperture radar images for object segmentation. Since 2012 he has been a senior researcher in the Photogrammetry and Remote Sensing group at ETH Zurich. He has received multiple awards, among them an ETH Postdoctoral fellowship, the science award of the German Geodetic Commission, an SNSF short visit grant to work at Caltech, and a scientific visit grant of Université Paris Est. His main interests lie in the interdisciplinary field of computer vision, machine learning, remote sensing, and photogrammetry. He is actively involved in multiple projects that aim at better understanding and modeling our environment at large scale, combining vision and machine learning.

The ever-increasing amount of geocoded images at varying scale, viewpoint, and temporal resolution provides a treasure trove of information for better understanding our environment, helping us make better decisions, manage resources, and improve quality of life, particularly in big cities. Geospatial computer vision combines vision and machine learning techniques that scale, to solve real-world problems. In this talk I will present three ongoing projects:

(a) 3D city models: Large-scale semantic 3D reconstruction casts 3D modeling and semantic labeling as a joint problem: semantic image segmentation enforces class-dependent geometric priors for reconstruction, while the 3D geometry in turn improves semantic labeling, all within one joint, convex energy formulation. This leads to more accurate 3D city models that directly come with category labels.

(b) Maps: Semantic segmentation is a fundamental remote sensing task, and most state-of-the-art methods rely on CNNs as their workhorse. A major reason for their success is that deep networks learn to accumulate contextual information over very large windows (receptive fields). However, this success comes at a cost, since the associated loss of effective spatial resolution washes out high-frequency details and leads to blurry object boundaries. We propose to counter this effect by combining semantic segmentation with semantically informed edge detection, thus making class boundaries explicit in the model. We present an end-to-end trainable CNN for semantic segmentation with built-in awareness of semantically meaningful boundaries.
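The notion of a "semantically informed" class boundary can be illustrated with a minimal sketch: given a label map, a boundary pixel is one whose class differs from that of a 4-neighbour. This is only a conceptual illustration with a hypothetical helper (`class_boundaries`), not the talk's actual CNN architecture.

```python
import numpy as np

def class_boundaries(labels):
    """Mark pixels whose class label differs from any 4-neighbour.

    labels: 2D integer array of per-pixel class IDs.
    Returns a boolean array of the same shape; True = class boundary.
    """
    b = np.zeros(labels.shape, dtype=bool)
    # Compare vertically adjacent pixels and mark both sides of a change.
    diff_v = labels[:-1, :] != labels[1:, :]
    b[:-1, :] |= diff_v
    b[1:, :] |= diff_v
    # Same for horizontally adjacent pixels.
    diff_h = labels[:, :-1] != labels[:, 1:]
    b[:, :-1] |= diff_h
    b[:, 1:] |= diff_h
    return b

# Toy 3x3 segmentation: classes 0, 1, 2.
seg = np.array([[0, 0, 1],
                [0, 0, 1],
                [2, 2, 2]])
edges = class_boundaries(seg)
```

In the actual model such boundaries are predicted by a learned edge-detection branch and fed back into the segmentation network end-to-end, rather than derived from the labels after the fact.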


(c) Trees: This project, in collaboration with the Caltech Computational Vision Lab, MIT, and the USDA Forest Service, aims to automatically catalogue trees in public space, classify them at species level, estimate their stress level, and measure their trunk diameter, to support urban and ecological planning. We propose an automated, image-based system to build up-to-date tree inventories at large scale using publicly available aerial images, street-level panoramas, and open GIS data of US cities.