Feature Space Optimization for Semantic Video Segmentation

Abstract

We present an approach to long-range spatio-temporal regularization in semantic video segmentation. Temporal regularization in video is challenging because both the camera and the scene may be in motion. Thus Euclidean distance in the space-time volume is not a good proxy for correspondence. We optimize the mapping of pixels to a Euclidean feature space so as to minimize distances between corresponding points. Structured prediction is performed by a dense CRF that operates on the optimized features. Experimental results demonstrate that the presented approach increases the accuracy and temporal consistency of semantic video segmentation.
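
As a rough illustration of the idea of pulling corresponding points together in feature space, the toy sketch below minimizes a simple quadratic objective: each feature stays close to its initial value while features of corresponding points are drawn toward each other. This is not the formulation from the paper. The objective, the Gauss-Seidel solver, the function names `optimize_features` and `distance`, the `weight` parameter, and the toy (x, y, t) data are all assumptions made for illustration. The dense CRF stage is not shown.

```python
def optimize_features(initial, correspondences, weight=10.0, iterations=200):
    """Toy feature optimization (an assumed objective, not the paper's).

    Minimizes  sum_i |f_i - f0_i|^2 + weight * sum_(i,j) |f_i - f_j|^2
    with Gauss-Seidel updates, where (i, j) ranges over corresponding points.
    """
    features = [list(f) for f in initial]
    neighbors = {i: [] for i in range(len(initial))}
    for i, j in correspondences:
        neighbors[i].append(j)
        neighbors[j].append(i)

    for _ in range(iterations):
        for i, f0 in enumerate(initial):
            # Setting the gradient with respect to f_i to zero gives a
            # weighted average of the initial feature and the neighbors.
            denom = 1.0 + weight * len(neighbors[i])
            for d in range(len(f0)):
                total = f0[d] + weight * sum(features[j][d] for j in neighbors[i])
                features[i][d] = total / denom
    return features


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


# Hypothetical (x, y, t) features: points 0 and 1 are the same scene point
# seen in consecutive frames after camera motion; point 2 has no match.
initial = [(10.0, 5.0, 0.0), (14.0, 5.0, 1.0), (30.0, 20.0, 0.0)]
correspondences = [(0, 1)]

optimized = optimize_features(initial, correspondences)
print(distance(initial[0], initial[1]), distance(optimized[0], optimized[1]))
```

In this toy setting the two corresponding points end up much closer in the optimized space than in the raw space-time volume, while the unmatched point keeps its initial feature. A regularizer that measures Euclidean distance on the optimized features would therefore couple the corresponding points more strongly.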

Paper

Abhijit Kundu, Vibhav Vineet, and Vladlen Koltun. Feature Space Optimization for Semantic Video Segmentation. CVPR 2016 (full oral).

Videos

Results on the Cityscapes dataset
Comparison on the CamVid dataset

Downloads

Reference source code can be downloaded from https://bitbucket.org/infinitei/videoparsing.

A curated version of the CamVid dataset used in the paper can be downloaded from here (15 GB).