CS 8803B Artificial Intelligence

Fall 2002

Project 3

Belief Networks

Due: Monday, November 11, 2002, 3:00PM

Introduction

This project is intended to give you a chance to explore Bayesian reasoning and belief networks in particular. Graphical models (meaning static and dynamic belief networks) are a general model of uncertain reasoning in AI, robotics, and computer vision, and understanding how they work is of fundamental importance. Here, we ask you to construct a simple belief network (10 to 20 nodes) of your own design and do some inferences on it. Next, we ask you to explore a bit beyond what was covered in class, and give an example to illustrate some instance of reasoning under uncertainty.

In this project you will be using Matlab (an interactive, matrix-oriented commercial programming language and environment from MathWorks to build a simple belief (or Bayesian) network and perform some simple queries on it. Matlab is available on most CoC machines, as well as the various OIT-run computer labs around the GA Tech campus. You will be asked to design a belief network for a (hypothetical) expert system in an area of your choice, and "implement" it using Matlab. The only programming this project involves is programming in Matlab. Matlab is relatively easy to use, but if you run into trouble, some quick Matlab tutorials are available here and here. Also check out the "getting started" guide available from MathWorks online, available in HTML and PDF (138 pages, 2.6MB). In fact, much of the Matlab documentation is available on-line from MathWorks.

You will be using a software package with Matlab known as the Bayes Net Toolbox (BNT). First, you will need to download the ZIP archive from here. Next, you will need to install it someplace (the complete package takes up about 3.25MB of space). You need Matlab 5.2 or newer to run it (the CoC has version 6.1, and I believe the OIT labs have at least version 5.3, if not something newer). Also note that you CANNOT use this package with Octave (a free Matlab clone), since Octave lacks the features necessary to run it.

Assignment

  1. Choose an area you are familiar with. Some examples include predicting the weather, troubleshooting a car, diagnosing a cold, predicting your grade, etc. These examples are only meant to show you that the area does not have to be a technical one (though it can be, if you prefer). After you choose an area, design a simple belief network with 10 to 20 nodes. Make sure that your network has an interesting topology (e.g. do NOT design a flat or linear network).
  2. Implement your network using BNT in Matlab (instructions below on how to do this). Get familiar with the basic functionalities of BNT (which we describe below), and implement your network.
  3. Perform some queries with your system. For each query, set evidence on some nodes and read off the values of some other nodes. At the very least, you must demonstrate:
  4. Familiarize yourself with some of the other features of BNT, such as other conditional probability distributions for the nodes, learning (distributions or structure), or dynamic Bayesian nets (DBNs). Provide a brief, concrete example illustrating this feature. It would be wise to make sure you understand the mathematics of the features to some degree you're using before you proceed (e.g. by reading the relevant sections of the textbook), as you will have to explain your example in the writeup.

Using BNT for Bayesian reasoning

Here we describe how to use BNT and Matlab to perform Bayesian reasoning on a simple belief network (this example is taken from your textbook, chapter 15--a diagram of the network appears in figure 15.2 on page 439).

First, start Matlab, and load up the BNT toolkit.

>> addpath /net/hc283/yaner/work/bayes/BNT
>> add_BNT_to_path
(Note: '>>' is the Matlab prompt). Next, we must construct an adjacency matrix for the directed acyclic graph corresponding to the belief net. We must number the nodes in topological order (parents before children, grandparents before grandchildren, etc.). Here, we have chosen to use the variables B, E, A, JC, and MC to refer to the nodes Burglary, Earthquake, Alarm, JohnCalls, and MaryCalls, respectively. In Matlab, arrays and matrices are indexed from 1, so the first node must be numbered 1.
>> N = 5;
>> dag = zeros(N,N);
>> B = 1; E = 2; A = 3; JC = 4; MC = 5;
>> dag([B,E],A) = 1;
>> dag(A,[JC,MC]) = 1;
>> dag
dag =
     0     0     1     0     0
     0     0     1     0     0
     0     0     0     1     1
     0     0     0     0     0
     0     0     0     0     0
Next, we construct the actual belief network object. The node_sizes variable refers to the size of the domains of each variable. This example uses boolean nodes, so the size is only 2, but it's possible to have, say, 4 or 6 values in the domain of a variable.
>> node_sizes = 2*ones(1,N);
>> bnet = mk_bnet(dag, node_sizes, 'names', {'B','E','A','JC','MC'});
You can visualize the graph structure of a belief net using the following command:
>> draw_graph(bnet.dag);
This will pop up a window with a drawing of the graph structure of your network. The nodes will be numbered according to your scheme (not labelled), so you must know the numbering scheme to interpret this drawing.

You must number the values of the domains much like you number the nodes in the network, and once again Matlab needs 1-based indexing (more on this below). In our case, we use index 1 to represent false and index 2 to represent true. This is arbitrary, but it must be consistent. The reason for these indices is when we go to specify the conditional probability tables (or CPTs) for each node, our scheme will be used to index the values in the table.

>> false = 1; true = 2;
Here is our first two CPTs, for nodes B and E. We only have a prior probability, here, so we need to specify CPT(false) and CPT(true), yielding a 2-element vector of the form [ P(F) P(T) ]:
>> bnet.CPD{B} = tabular_CPD(bnet, B, [.999, .001]);
>> bnet.CPD{E} = tabular_CPD(bnet, E, [.998, .002]);
Here, the prior P(B) = .001, and the prior P(E) = .002. The CPT for A is more complicated, since it has two parents (B and E). The CPT will take three indices, one for each parent node, and one for the value of the node itself. The order of the indices is the numerical order of the nodes. In this case, B, E, and A are 1, 2, and 3, so the first index corresponds to the value of B, the second the value of E, and the third the value of A. So, for instance CPT(2,2,2) is the conditional probability P(A|B,E), which is .95 in our example. We do not need to specify the 3-dimensional vector, however; we can use a 1-dimensional vector and BNT will "reshape" it for us. Matlab increments the left-most indices first, so the order will be as follows:
B E | A | P(A|B,E)
------------------
F F | F | .999
T F | F | .06
F T | F | .71
T T | F | .05
F F | T | .001
T F | T | .94
F T | T | .29
T T | T | .95
Recall that true (T) is 2 and false (F) is 1. This gives us the CPT for A:
>> bnet.CPD{A} = tabular_CPD(bnet, A, [.999, .06, .71, .05, .001, .94, .29, .95]);
The tables for the last two nodes, JC and MC, can be computed similarly. Since there's only one parent for each of them, the vector will be of length four. For instance, for JC it will contain [ P(~JC|~A) P(~JC|A) P(JC|~A) P(JC|A) ].
>> bnet.CPD{JC} = tabular_CPD(bnet, JC, [.95, .10, .05, .90]);
>> bnet.CPD{MC} = tabular_CPD(bnet, MC, [.99, .30, .01, .70]);
Now for the inference. First we set up the inference engine. Here we're using the junction tree inference method, which is an exact method of inference for any belief net (not just polytrees).
>> engine = jtree_inf_engine(bnet);
Here we set up the evidence for a diagnostic query. A diagnostic inference is a bottom-up inference from effects to causes. Here, we're interested in the probability of a burglary given that John has called, P(B|JC).
>> evidence = cell(1,N);
>> evidence{JC} = true;
The following commands actually compute the answer:
>> [engine, loglik] = enter_evidence(engine, evidence);
>> marg = marginal_nodes(engine, B);
>> marg.T(true)
ans =
    0.0163
The last command returns the answer (1.6%). Actually, marg.T is a table, indexed by the possible domain values for the node B. It is the posterior distribution for the node B given our evidence:
>> marg.T
ans =
    0.9837
    0.0163
Next, we do a causal inference. Causal inferences reason top-down from causes to effects. Here, we ask for the likelihood that Mary calls given that there is a burglary, P(MC|B). The evidence:
>> evidence = cell(1,N);
>> evidence{B} = true;
And the inference:
>> [engine, loglik] = enter_evidence(engine, evidence);
>> marg = marginal_nodes(engine, MC);
>> marg.T
ans =
    0.3414
    0.6586
Here, P(MC|B) is 65.9%. A simple intercausal inference is to ask what the likelihood of one causal node having a particular value is, given the value of another. Here, we ask for the likelihood of a burglary given an earthquake (and this simple network):
>> evidence = cell(1,N);
>> evidence{E} = true;
>> [engine, loglik] = enter_evidence(engine, evidence);
>> marg = marginal_nodes(engine, B);
>> marg.T(true)
ans =
   1.0000e-03
The answer returned is 0.1% (low, as we would expect). This is not a terribly interesting query. Often times, intercausal inference is known as "explaining away", and this next example illustrates this pattern more fully. First, we do a simple diagnostic query asking for the probability of a burglary given that the alarm went off. Next, we revise it with evidence that an earthquake occurred. This lowers the answer considerably, hence "explaining away" the alarm.
>> evidence = cell(1,N);
>> evidence{A} = true;
>> [engine, loglik] = enter_evidence(engine, evidence);
>> marg = marginal_nodes(engine, B);
>> marg.T(true)
ans =
    0.3736
We see that the answer is 37.4%. Next, we add the evidence of an earthquake, and recompute:
>> evidence{E} = true;
>> [engine, loglik] = enter_evidence(engine, evidence);
>> marg = marginal_nodes(engine, B);
>> marg.T(true) % returns P(B|A,E)
ans =
    0.0033
The probability drops to 3.3%, hence "explaining away" the alarm.

More details on how to use BNT are availabile here. The Matlab code for these examples can be found in this Matlab file.

Deliverables

The following are due in class on the due date: The following must be e-mailed to the TA at yaner@cc.gatech.edu by the start of class on the due date: a Matlab file (ASCII text file with the extension .m) with the code (commands) for the first part of the assignment (part 3 above), and a similar file for part 4. A Matlab file is a RAW TEXT file (e.g. created with Notepad under windows, or vi or pico in Unix or Linux); HTML, Word documents, or any other format will not be accepted, and points will have to be deducted if the file is submitted in this form. Last modified: Mon Oct 28 20:54:46 EST 2002