Approximate discriminative training of graphical models

Jeremy Jancsary
PhD thesis, Institute of Telecommunications, Vienna University of Technology
Reviewers: Gerald Matz, Harald Trost
Vienna, August 2012

Over the past decade, graphical models have emerged as a workhorse for statistical processing of data in disciplines as diverse as computer vision, natural language processing, digital communications and computational biology. Problems from all of these disciplines have in common that the objects of interest possess rich internal structure. Graphical models help us make this structure explicit and exploit it during statistical inference.

The proliferation of freely available data has lead to reinforced interest in approaches that learn from existing examples how to infer the properties of previously unseen instances. Graphical models provide a sound formal framework towards this end—the discriminative learning approach seeks to estimate the parameters of a graphical model such that its predictions are consistent with the observed data. While conceptually simple, the prevalent approaches suffer from computational intractability if the underlying graphical model contains cycles. Unfortunately, this is the case in many applications of practical interest. During recent years, the understanding of approximate inference in graphical models has improved dramatically. Yet, in a learning scenario, intractability is still often dealt with in an ad-hoc or heuristic manner. This thesis aims to aid the goal of bridging this gap.

The first approach we present draws heavily on tools from convex optimization. Based on a variational characterization of the inference problems in graphical models, we present a whole catalogue of equivalent formulations of discriminative learning, each exposing different merits. The idea underlying this approach is to relax the inference subproblems, that is, to optimize over a simpler constraint set. We introduce new algorithms that can be used to solve these relaxed problems more efficiently than previously possible. Alternatively, we demonstrate how the variational viewpoint allows to formulate discriminative learning as a single unconstrained convex optimization problem that can be solved using off-the-shelf solvers.

Our second approach is based on a Gaussian model and allows for treatment of both discrete and continuous learning tasks. Discrete variables can be handled either via a high-dimensional encoding, or by optimizing a specific loss function. While the use of a Gaussian predictive density may seem overly restrictive at first, we demonstrate how the expressiveness of the model can be increased significantly via non-parametric conditioning.

We present applications from computer vision and natural language processing that demonstrate the wide applicability of our algorithms and the importance of employing principled approximations. A highlight among our results is that we obtain the best published numbers for natural image denoising and related image restoration problems. 

PhD thesis

 

 

 

 

Note: I plan to maintain an errata sheet on this page. If you believe you found a mistake, please let me know, and I will update the thesis and acknowledge your contribution.

Please cite as:

@PHDTHESIS{Jancsary2012c,
author = {Jeremy Jancsary},
title = {Approximate Discriminative Training of Graphical Models},
school = {Institute of Telecommunications, Vienna University of Technology},
year = {2012}
}