College of Computing

Georgia Institute of Technology

This course will cover basic and advanced techniques in Probabilistic Graphical Models. The goal of this class is to gain a theoretical and practical understanding of these models, which include many standard probability models such as mixture models, hidden Markov models, and Markov random fields. We will also describe applications of PGMs to real-world problems, drawn from computer vision and other fields.

MWF 2:00-3:00

ES&T L1105

Jim Rehg

Email: rehg@gatech.edu

Office Hours: MF 4-5

Familiarity with basic probability, graph theory, and linear algebra. This is an introductory course and the lectures will be self-contained.

There are two required texts for this course:

- *Probabilistic Graphical Models: Principles and Techniques* by Daphne Koller and Nir Friedman, MIT Press, 2009 (available at the GT bookstore)
- "Graphical Models, Exponential Families, and Variational Inference" by Martin J. Wainwright and Michael I. Jordan, *Foundations and Trends in Machine Learning*, 1(1-2):1-305, 2008. DOI: 10.1561/2200000001 [link to article]

Each Monday you will be assigned a short problem set, which will be due by midnight the following Sunday. You will submit your solutions using T-Square. These problem sets will consist of exercises from the Koller and Friedman book, along with other sources. In addition, there will be three assigned mini-projects, and there will be an open-book, in-class final exam.

All work must be submitted by the deadline. Late work will be assessed a penalty of 10 points per day.

- Problem Sets: 25%
- Project 1: 20%
- Project 2: 20%
- Project 3: 20%
- Final Exam: 15%

Each student will do three mini-projects on the following topics:

- Bayes net reasoning
- Topic modeling using LDA
- Image segmentation using MRFs

For each project, we will provide you with some scaffolding code, and you will propose a topic and work on it. For example, within LDA you might propose to model a particular text corpus of interest and then use the resulting LDA model to accomplish some task. Or you might propose to compare two approximate learning methods, such as variational EM and Gibbs sampling, to see how they perform. We will provide the basic code base; you decide how to use it and what to study. We will provide Matlab code to support each of these projects. In addition, we can point you to code in other languages if you are not interested in using Matlab, or if you want to work with larger models where Matlab may be too slow.
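
To give a flavor of the Gibbs-sampling side of such a comparison, here is a minimal collapsed Gibbs sampler for LDA in Python. This is an illustrative sketch, not the course's Matlab scaffolding; the function name, corpus, and hyperparameter values are all invented for the example.

```python
import numpy as np

def lda_gibbs(docs, n_topics, n_words, n_iter=100, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative sketch).

    docs: list of documents, each a list of word ids in [0, n_words).
    Returns doc-topic and topic-word count matrices after sampling.
    """
    rng = np.random.default_rng(seed)
    z = [rng.integers(n_topics, size=len(d)) for d in docs]  # topic of each token
    ndk = np.zeros((len(docs), n_topics))  # doc-topic counts
    nkw = np.zeros((n_topics, n_words))    # topic-word counts
    nk = np.zeros(n_topics)                # tokens per topic
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]
                # Remove this token's counts, then resample its topic from
                # the full conditional p(z = k | all other assignments).
                ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
                p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + n_words * beta)
                k = rng.choice(n_topics, p=p / p.sum())
                z[d][i] = k
                ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
    return ndk, nkw
```

Variational EM would replace the sampling loop with deterministic updates of variational parameters; comparing the two on the same corpus is exactly the kind of study a project proposal might contain.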

**8/22: Overview** (slides)

- Discussion of course syllabus
- Examples of probabilistic graphical models
- Course goals

**8/24: Bayesian networks** (slides)

- Definition of Bayesian networks
- Local Markov properties
- Independence and factorization
- Resources: Code for Student example
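
The factorization property of this lecture can be sketched numerically: a Bayesian network's joint distribution is the product of its local CPDs. Below is a Student-style network in Python (a sketch, not the course's Matlab code for the Student example; all CPD numbers are invented, not those from Koller and Friedman).

```python
import numpy as np

# Student-style network: Difficulty (D), Intelligence (I), Grade (G),
# SAT (S), Letter (L); G depends on D and I, S on I, L on G.
# All CPD values below are invented for illustration.
P_D = np.array([0.6, 0.4])                      # P(D)
P_I = np.array([0.7, 0.3])                      # P(I)
P_G = np.array([[[0.30, 0.40, 0.30],            # P(G | D, I), G has 3 values
                 [0.90, 0.08, 0.02]],
                [[0.05, 0.25, 0.70],
                 [0.50, 0.30, 0.20]]])
P_S = np.array([[0.95, 0.05], [0.20, 0.80]])    # P(S | I)
P_L = np.array([[0.10, 0.90], [0.40, 0.60], [0.99, 0.01]])  # P(L | G)

# The joint factorizes as P(D) P(I) P(G|D,I) P(S|I) P(L|G).
joint = np.einsum('d,i,dig,is,gl->digsl', P_D, P_I, P_G, P_S, P_L)
assert np.isclose(joint.sum(), 1.0)   # normalization comes for free

# Inference by enumeration: a strong letter raises belief in high intelligence.
p_il = joint.sum(axis=(0, 2, 3))                # P(I, L)
p_i_given_l1 = p_il[:, 1] / p_il[:, 1].sum()    # P(I | L = 1)
assert p_i_given_l1[1] > P_I[1]
```

Enumeration is exponential in the number of variables; the later lectures on variable elimination and message passing show how the factorization avoids that cost.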

**8/26: D-separation** (slides)

- Global independencies
- D-separation
- Bayes Ball algorithm
- Soundness of d-separation
- Completeness and Perfect maps
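
The collider case of d-separation can be checked numerically: in the v-structure D → G ← I, the parents are d-separated (hence independent) marginally, but observing the collider G activates the trail and makes them dependent. A small Python check, with invented CPD numbers:

```python
import numpy as np

# Hypothetical v-structure D -> G <- I (all numbers invented for illustration)
P_D = np.array([0.6, 0.4])
P_I = np.array([0.7, 0.3])
P_G = np.array([[[0.9, 0.1], [0.4, 0.6]],
                [[0.5, 0.5], [0.1, 0.9]]])      # P(G | D, I)

joint = np.einsum('d,i,dig->dig', P_D, P_I, P_G)

# Marginally, no active trail between D and I: they are independent.
p_di = joint.sum(axis=2)                        # P(D, I)
assert np.allclose(p_di, np.outer(P_D, P_I))

# Conditioning on the collider G activates the trail: D and I become dependent.
p_dig_g0 = joint[:, :, 0] / joint[:, :, 0].sum()    # P(D, I | G = 0)
pd_g0 = p_dig_g0.sum(axis=1)
pi_g0 = p_dig_g0.sum(axis=0)
assert not np.allclose(p_dig_g0, np.outer(pd_g0, pi_g0))
```

This is the "explaining away" behavior that the Bayes Ball rules encode for collider nodes.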

**8/29: Markov networks** (slides)

- Definition of Markov networks
- Factors and Gibbs distributions
- Global Markov independencies

**9/5: No class** (Labor Day)

**9/7: Soundness and completeness** (slides)

- Soundness of separation
- Completeness
- Perfect maps

**9/9: Variable elimination** (slides)

- Inference
- Variable elimination example
- Moralization
- Elimination algorithm
- Graph transformation
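
The elimination algorithm can be sketched on a small chain, where summing out one variable at a time reproduces the brute-force marginal at a fraction of the cost. A Python sketch (random factor values, not the course code):

```python
import numpy as np

# Toy 4-node chain A - B - C - D with pairwise factors (values are random).
rng = np.random.default_rng(0)
phi = [rng.random((3, 3)) + 0.1 for _ in range(3)]   # factors on (A,B), (B,C), (C,D)

# Brute force: build the full joint, then marginalize down to D.
joint = np.einsum('ab,bc,cd->abcd', *phi)
p_d_brute = joint.sum(axis=(0, 1, 2))

# Variable elimination: sum out A, then B, then C, never forming the joint.
m_b = phi[0].sum(axis=0)                     # sum_A phi(A,B)  -> factor over B
m_c = (m_b[:, None] * phi[1]).sum(axis=0)    # sum_B            -> factor over C
m_d = (m_c[:, None] * phi[2]).sum(axis=0)    # sum_C            -> factor over D

assert np.allclose(m_d, p_d_brute)
```

Here each elimination touches only a 3x3 table, while the brute-force joint has 3^4 entries; on longer chains the gap grows exponentially.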

**9/12: Cluster graph** (slides)

- Elimination graph
- Cluster graph generation
- Cluster graph properties
- Running intersection property

**9/14: Sum-product message passing** (slides)

- Sum-product message passing
- Clique tree properties
- Clique tree calibration
- Clique tree invariant

**9/16: Clique tree calibration**

- Clique tree invariant
- Belief propagation message passing
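
On a chain, calibration is especially simple: one forward and one backward sweep of sum-product messages yields every node marginal at once. A Python sketch with invented pairwise potentials:

```python
import numpy as np

# Sum-product on a 4-node chain; potentials are random, for illustration only.
rng = np.random.default_rng(1)
n = 4
psi = [rng.random((2, 2)) + 0.1 for _ in range(n - 1)]   # pairwise potentials

# Forward messages: f[i] is the message arriving at node i from the left.
f = [np.ones(2)]
for i in range(n - 1):
    f.append(psi[i].T @ f[i])

# Backward messages: b[i] is the message arriving at node i from the right.
b = [np.ones(2) for _ in range(n)]
for i in range(n - 2, -1, -1):
    b[i] = psi[i] @ b[i + 1]

# After both sweeps every node is calibrated: marginal ∝ f[i] * b[i].
marg = [f[i] * b[i] for i in range(n)]
marg = [m / m.sum() for m in marg]

# Check against brute-force enumeration of the joint.
joint = np.einsum('ab,bc,cd->abcd', *psi)
joint /= joint.sum()
for i in range(n):
    axes = tuple(j for j in range(n) if j != i)
    assert np.allclose(marg[i], joint.sum(axis=axes))
```

Two sweeps for all marginals, versus one full variable-elimination run per query node, is the practical payoff of clique tree calibration.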

**Representation**

- Exponential family
- Generalized linear models
- Markov random fields
- Temporal models
- Gauss Markov networks
- Conditional random fields
- Factor graphs

**Inference**

- Variational inference and sum-product
- Loopy belief propagation
- Conjugate duality, maximum entropy, maximum likelihood
- Marginal polytope
- Mean field method
- Structured mean field
- Tree-reweighted message passing
- Importance sampling
- Gibbs sampling and MCMC
- MAP inference and max-product
- Loopy max-product and counting numbers
- MAP via graph cuts
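
To preview MAP inference and max-product: replacing sums with maximizations in the chain message passing above gives Viterbi-style MAP decoding. A Python sketch (random potentials, invented for illustration), checked against brute-force search:

```python
import numpy as np

# Max-product MAP inference on a 4-node chain with random pairwise potentials.
rng = np.random.default_rng(2)
n, k = 4, 3
psi = [rng.random((k, k)) + 0.1 for _ in range(n - 1)]

# Forward max-messages with back-pointers (Viterbi-style).
m = np.ones(k)
back = []
for i in range(n - 1):
    scores = m[:, None] * psi[i]        # best prefix score for each (x_i, x_{i+1})
    back.append(scores.argmax(axis=0))  # best x_i for each value of x_{i+1}
    m = scores.max(axis=0)

# Decode the MAP assignment by backtracking from the best final state.
x = [int(m.argmax())]
for i in range(n - 2, -1, -1):
    x.insert(0, int(back[i][x[0]]))

# Brute force over all k^n assignments agrees.
joint = np.einsum('ab,bc,cd->abcd', *psi)
assert tuple(x) == np.unravel_index(joint.argmax(), joint.shape)
```

For loopy graphs exact max-product is intractable in general, which is what motivates the loopy max-product and graph-cut topics listed above.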

**Learning**

- Maximum likelihood parameter estimation
- Bayesian learning and conjugate priors
- Variational EM and variational Bayes
- Learning in LDA, MRF, CRF
- Structure learning
- Causality