
Benders' Decomposition for Solving the Lagrangian Relaxation of Probabilistic Graphical Models

The goal of this thesis is to develop a novel inference method for probabilistic graphical models. Variational inference methods have been widely and successfully used for many computationally challenging statistical inference problems. However, the usual stopping criterion for variational inference algorithms is that the change in the objective function is small, so it is unknown whether the algorithm has truly converged. Benders' decomposition is an optimization method that can provide upper and lower bounds on the optimal value and thus a certificate of optimality for variational inference problems. Using Benders' decomposition, we propose a graphical model inference method that comes with an optimality guarantee. In the first chapter, we introduce the thesis: the motivation for our research, our contributions, and an overview of the organization. In the second chapter, we review the general context of the problem, comparable inference methods, and the techniques we use in developing our method. In the third chapter, we attempt to develop an algorithm for a maximum a posteriori (MAP) clustering problem under the Gaussian mixture model. We formulate the problem as a mixed-integer nonlinear program (MINLP), turning it into an optimization problem that can accommodate side constraints. We then prove that the problem satisfies the conditions of Generalized Benders Decomposition (GBD). We explore three derivations of GBD for the problem: (1) exploiting all explicit constraints, (2) exploiting only the constraints over $\bm{\pi}$, and (3) deriving a simple linear form of the optimal cuts via a Taylor series. None of these approaches yet yields a successful algorithm; we discuss the flaws of each. In the fourth chapter, we extend the idea from the third chapter and develop a maximum a posteriori (MAP) inference method for general Bayesian factor models.
Previous work has shown that the MAP assignment problem can be relaxed via Lagrangian or linear programming relaxation, and that its dual form yields a computationally efficient inference algorithm. The remaining question is which constraints to add back into the relaxed problem. We use generalized Benders decomposition to sequentially add optimal cuts to the fully relaxed dual problem. The method guarantees $\epsilon$-convergence by tightening the gap between upper and lower bounds on the problem. We also show which condition must hold for the certificate of optimality to be global; if the condition is not met, the certificate is only local. Two Bayesian models, the Bayesian Gaussian mixture model (BGMM) and the latent Dirichlet allocation (LDA) model, are explored with the proposed method. For each model, we run the algorithm on standard data sets and compare the results to those of variational Bayes and a Gibbs sampler. Our method outperforms variational Bayes and the Gibbs sampler on some of the data sets by achieving a higher log(MAP) value, and it always terminates with guaranteed optimality, which the other two methods do not provide. We then discuss a few practical details to consider when applying the method. Lastly, we contribute dynamic programming algorithms and related experiments for exact inference in hierarchical clustering.
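The bound-tightening loop at the heart of this approach can be illustrated on a toy problem. The sketch below is not the thesis's algorithm: the objective, variable ranges, and all names are hypothetical, and the master problem is solved by brute-force enumeration rather than a MIP solver. It shows the generic Benders pattern the abstract describes: fix the complicating variable, solve the subproblem to get a feasible value (upper bound) and a supporting cut, re-solve the master over the accumulated cuts to get a lower bound, and stop when the gap closes, which is exactly the certificate of optimality.

```python
# Toy problem (illustrative only): minimize over integer y in {0,...,10}
# and real x the function f(x, y) = (x - 3)^2 + (y - 4)^2 + x*y.
# The subproblem (y fixed, minimize over x) has a closed form; each solve
# yields a supporting cut on the value function v(y) via its subgradient
# in y. The master minimizes the piecewise-linear lower envelope of cuts.

def subproblem(y_hat):
    """Minimize f(x, y_hat) over x; return optimal value and subgradient in y."""
    x_star = 3.0 - y_hat / 2.0            # stationarity: 2(x - 3) + y = 0
    value = (x_star - 3.0) ** 2 + (y_hat - 4.0) ** 2 + x_star * y_hat
    slope = 2.0 * (y_hat - 4.0) + x_star  # d f / d y at (x*, y_hat)
    return value, slope

def benders(eps=1e-6, max_iter=50):
    cuts = []            # (value, slope, y_c): v(y) >= value + slope * (y - y_c)
    y_hat, ub, best_y = 0, float("inf"), None
    for _ in range(max_iter):
        value, slope = subproblem(y_hat)
        if value < ub:                    # feasible point -> tighter upper bound
            ub, best_y = value, y_hat
        cuts.append((value, slope, y_hat))
        # Master problem: enumerate the small integer domain and minimize
        # the maximum over all cuts; its optimum is a valid lower bound.
        lb, y_hat = min(
            (max(v + s * (y - yc) for v, s, yc in cuts), y)
            for y in range(11)
        )
        if ub - lb <= eps:                # gap closed: certificate of optimality
            break
    return best_y, ub, lb

best_y, ub, lb = benders()
```

On this toy instance the loop closes the gap in a handful of iterations, ending at $y = 3$ with matching bounds, in contrast to a "change in objective is small" stopping rule, which certifies nothing about distance to the optimum.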