S. Nowozin, J. Jancsary, P. V. Gehler, C. H. Lampert (editors)
The goal of structured prediction is to build machine learning models that predict relational information that itself has structure, such as being composed of multiple interrelated parts. These models, which reflect prior knowledge, task-specific relations, and constraints, are used in fields including computer vision, speech recognition, natural language processing, and computational biology. They can carry out such tasks as predicting a natural language sentence, or segmenting an image into meaningful components.
These models are expressive and powerful, but exact computation is often intractable. A broad research effort in recent years has aimed at designing structured prediction models and approximate inference and learning procedures that are computationally efficient. This volume offers an overview of this recent research in order to make the work accessible to a broader research community. The chapters, by leading researchers in the field, cover a range of topics, including research trends, the linear programming relaxation approach, innovations in probabilistic modeling, recent theoretical progress, and resource-aware learning.
I am happy to announce that our reference implementation of the Regression Tree Fields model is now publicly available on the Microsoft Research downloads website:
The code has been used to obtain state-of-the-art results in low-level image processing tasks such as natural image denoising and natural image deblurring, but has meanwhile also become useful for discrete labeling tasks such as semantic segmentation and even part-of-speech tagging.
We put a lot of effort into the release and hope that the academic community will find it useful. Enjoy! – Jeremy
U. Schmidt, C. Rother, S. Nowozin, J. Jancsary and S. Roth
Non-blind deblurring is an integral component of blind approaches for removing image blur due to camera shake. Even though learning-based deblurring methods exist, they have been limited to the generative case and are computationally expensive. To date, manually defined models are thus the most widely used, although they limit the attainable restoration quality. We address this gap by proposing a discriminative approach for non-blind deblurring. One key challenge is that the blur kernel in use at test time is not known in advance. To address this, we analyze existing approaches that use half-quadratic regularization. From this analysis, we derive a discriminative model cascade for image deblurring. Our cascade model consists of a Gaussian CRF at each stage, based on the recently introduced regression tree fields. We train our model by loss minimization and use synthetically generated blur kernels to generate training data. Our experiments show that the proposed approach is efficient and yields state-of-the-art restoration quality on images corrupted with synthetic and real blur.
J. Jancsary, S. Nowozin and C. Rother
After a decade of rapid progress in image denoising, recent methods seem to have reached a performance limit. Nonetheless, we find that state-of-the-art denoising methods are visually clearly distinguishable and possess complementary strengths and failure modes. Motivated by this observation, we introduce a powerful non-parametric image restoration framework based on Regression Tree Fields (RTF). Our restoration model is a densely connected tractable conditional random field that leverages existing methods to produce an image-dependent, globally consistent prediction. We estimate the conditional structure and parameters of our model from training data so as to directly optimize for popular performance measures. In terms of peak signal-to-noise ratio (PSNR), our model improves on the best published denoising method by at least 0.26 dB across a range of noise levels. Our most practical variant still yields statistically significant improvements, yet is over 20x faster than the strongest competitor. Our approach is well-suited for many more image restoration and low-level vision problems, as evidenced by substantial gains in tasks such as removal of JPEG blocking artefacts.
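For readers unfamiliar with the measure, the following is a minimal sketch of how PSNR is commonly computed, assuming 8-bit images with a peak value of 255. The function names `psnr` and `mse_ratio_for_gain` are illustrative and are not taken from the paper or its code release.

```python
import numpy as np


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    `peak` is the maximum possible pixel value (255 is an assumption
    that holds for 8-bit images).
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE shrinks for a given PSNR gain in dB."""
    return 10.0 ** (-gain_db / 10.0)
```

Under this definition, a gain of 0.26 dB at a fixed peak value corresponds to a mean squared error that is roughly 6% lower, since `mse_ratio_for_gain(0.26)` is about 0.94.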
J. Jancsary, S. Nowozin, T. Sharp and C. Rother
We introduce Regression Tree Fields (RTFs), a fully conditional random field model for image labeling problems. RTFs gain their expressive power from the use of nonparametric regression trees that specify a tractable Gaussian random field, thereby ensuring globally consistent predictions. Our approach improves on the recently introduced decision tree field (DTF) model in three key ways: (i) RTFs have tractable test-time inference, making efficient optimal predictions feasible and orders of magnitude faster than for DTFs, (ii) RTFs can be applied to both discrete and continuous vector-valued labeling tasks, and (iii) the entire model, including the structure of the regression trees and energy function parameters, can be efficiently and jointly learned from training data. We demonstrate the expressive power and flexibility of the RTF model on a wide variety of tasks, including inpainting, colorization, denoising, and joint detection and registration. We achieve excellent predictive performance which is on par with, or even surpassing, DTFs on all tasks where a comparison is possible.
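To illustrate why a Gaussian random field gives tractable inference, here is a toy sketch, not the RTF implementation. It assumes a quadratic energy E(y) = 0.5 * y^T Theta y - theta^T y on a 1D chain, with hand-set parameters rather than parameters selected by regression trees. With a positive definite Theta, the optimal prediction is the solution of a single linear system. The function name `gaussian_chain_map` and the `smoothness` parameter are hypothetical.

```python
import numpy as np


def gaussian_chain_map(x, smoothness=1.0):
    """MAP estimate of a toy 1D Gaussian field given a noisy signal x.

    Assumed energy (illustrative only):
      unary terms    0.5 * (y_i - x_i)^2
      pairwise terms 0.5 * smoothness * (y_i - y_{i+1})^2
    which can be written as 0.5 * y^T Theta y - theta^T y + const.
    """
    x = np.asarray(x, dtype=np.float64)
    n = x.size
    Theta = np.eye(n)  # contribution of the unary terms
    for i in range(n - 1):
        # Each pairwise term adds a 2x2 block to the precision matrix.
        Theta[i, i] += smoothness
        Theta[i + 1, i + 1] += smoothness
        Theta[i, i + 1] -= smoothness
        Theta[i + 1, i] -= smoothness
    theta = x.copy()  # linear term of the unary potentials
    # The minimizer satisfies Theta y = theta.
    return np.linalg.solve(Theta, theta)
```

In this toy setting the parameters are the same everywhere. The abstract's point is that regression trees make them depend on the input image while inference remains a linear solve.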
J. Jancsary and G. Matz
We investigate minimization of tree-reweighted free energies for the purpose of obtaining approximate marginal probabilities and upper bounds on the partition function of cyclic graphical models. The solvers we present for this problem work by directly tightening tree-reweighted upper bounds. As a result, they are particularly efficient for tree-reweighted energies arising from a small number of spanning trees. While this assumption may seem restrictive at first, we show how small sets of trees can be constructed in a principled manner. An appealing property of our algorithms, which results from the problem decomposition, is that they are embarrassingly parallel. In contrast to the original message passing algorithm introduced for this problem, we obtain global convergence guarantees.
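As background, the tree-reweighted upper bound mentioned above is usually stated as follows. The notation is an assumption and is not taken from the paper: \theta denotes the parameters of the cyclic model, \mathcal{T} a set of spanning trees, \rho a probability distribution over those trees, and \theta(T) parameters that are nonzero only on the edges of tree T.

```latex
\log Z(\theta) \;\le\; \sum_{T \in \mathcal{T}} \rho(T)\, \log Z\bigl(\theta(T)\bigr)
\qquad \text{subject to} \qquad
\sum_{T \in \mathcal{T}} \rho(T)\, \theta(T) = \theta,
\quad \rho(T) \ge 0,
\quad \sum_{T \in \mathcal{T}} \rho(T) = 1 .
```

The inequality follows from the convexity of \log Z via Jensen's inequality. Each term on the right involves only a tree-structured model, so it can be evaluated exactly. Minimizing the right-hand side over the per-tree parameters \theta(T) tightens the bound.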