In this project you will be using Matlab (an interactive, matrix-oriented commercial programming language and environment from MathWorks to build a simple belief (or Bayesian) network and perform some simple queries on it. Matlab is available on most CoC machines, as well as the various OIT-run computer labs around the GA Tech campus. You will be asked to design a belief network for a (hypothetical) expert system in an area of your choice, and "implement" it using Matlab. The only programming this project involves is programming in Matlab. Matlab is relatively easy to use, but if you run into trouble, some quick Matlab tutorials are available here and here. Also check out the "getting started" guide available from MathWorks online, available in HTML and PDF (138 pages, 2.6MB). In fact, much of the Matlab documentation is available on-line from MathWorks.
You will be using a software package with Matlab known as the Bayes Net Toolbox (BNT). First, you will need to download the ZIP archive from here. Next, you will need to install it someplace (the complete package takes up about 3.25MB of space). You need Matlab 5.2 or newer to run it (the CoC has version 6.1, and I believe the OIT labs have at least version 5.3, if not something newer). Also note that you CANNOT use this package with Octave (a free Matlab clone), since Octave lacks the features necessary to run it.
First, start Matlab, and load up the BNT toolkit.
>> addpath /net/hc283/yaner/work/bayes/BNT >> add_BNT_to_path(Note: '>>' is the Matlab prompt). Next, we must construct an adjacency matrix for the directed acyclic graph corresponding to the belief net. We must number the nodes in topological order (parents before children, grandparents before grandchildren, etc.). Here, we have chosen to use the variables B, E, A, JC, and MC to refer to the nodes Burglary, Earthquake, Alarm, JohnCalls, and MaryCalls, respectively. In Matlab, arrays and matrices are indexed from 1, so the first node must be numbered 1.
>> N = 5;
>> dag = zeros(N,N);
>> B = 1; E = 2; A = 3; JC = 4; MC = 5;
>> dag([B,E],A) = 1;
>> dag(A,[JC,MC]) = 1;
>> dag
dag =
0 0 1 0 0
0 0 1 0 0
0 0 0 1 1
0 0 0 0 0
0 0 0 0 0
Next, we construct the actual belief network object. The node_sizes variable refers to
the size of the domains of each variable. This example uses boolean nodes, so the size
is only 2, but it's possible to have, say, 4 or 6 values in the domain of a variable.
>> node_sizes = 2*ones(1,N);
>> bnet = mk_bnet(dag, node_sizes, 'names', {'B','E','A','JC','MC'});
You can visualize the graph structure of a belief net using the following command:
>> draw_graph(bnet.dag);This will pop up a window with a drawing of the graph structure of your network. The nodes will be numbered according to your scheme (not labelled), so you must know the numbering scheme to interpret this drawing.
You must number the values of the domains much like you number the nodes in the network, and once again Matlab needs 1-based indexing (more on this below). In our case, we use index 1 to represent false and index 2 to represent true. This is arbitrary, but it must be consistent. The reason for these indices is when we go to specify the conditional probability tables (or CPTs) for each node, our scheme will be used to index the values in the table.
>> false = 1; true = 2;Here is our first two CPTs, for nodes B and E. We only have a prior probability, here, so we need to specify CPT(false) and CPT(true), yielding a 2-element vector of the form [ P(F) P(T) ]:
>> bnet.CPD{B} = tabular_CPD(bnet, B, [.999, .001]);
>> bnet.CPD{E} = tabular_CPD(bnet, E, [.998, .002]);
Here, the prior P(B) = .001, and the prior P(E) = .002. The CPT for A is more complicated,
since it has two parents (B and E). The CPT will take three indices, one for each parent
node, and one for the value of the node itself. The order of the indices is the numerical
order of the nodes. In this case, B, E, and A are 1, 2, and 3, so the first index corresponds
to the value of B, the second the value of E, and the third the value of A. So, for instance
CPT(2,2,2) is the conditional probability P(A|B,E), which is .95 in our example. We do not
need to specify the 3-dimensional vector, however; we can use a 1-dimensional vector and BNT
will "reshape" it for us. Matlab increments the left-most indices first, so the order will
be as follows:
B E | A | P(A|B,E) ------------------ F F | F | .999 T F | F | .06 F T | F | .71 T T | F | .05 F F | T | .001 T F | T | .94 F T | T | .29 T T | T | .95Recall that true (T) is 2 and false (F) is 1. This gives us the CPT for A:
>> bnet.CPD{A} = tabular_CPD(bnet, A, [.999, .06, .71, .05, .001, .94, .29, .95]);
The tables for the last two nodes, JC and MC, can be computed similarly. Since
there's only one parent for each of them, the vector will be of length four.
For instance, for JC it will contain [ P(~JC|~A) P(~JC|A) P(JC|~A) P(JC|A) ].
>> bnet.CPD{JC} = tabular_CPD(bnet, JC, [.95, .10, .05, .90]);
>> bnet.CPD{MC} = tabular_CPD(bnet, MC, [.99, .30, .01, .70]);
Now for the inference. First we set up the inference engine. Here we're using
the junction tree inference method, which is an exact method of inference for
any belief net (not just polytrees).
>> engine = jtree_inf_engine(bnet);Here we set up the evidence for a diagnostic query. A diagnostic inference is a bottom-up inference from effects to causes. Here, we're interested in the probability of a burglary given that John has called, P(B|JC).
>> evidence = cell(1,N);
>> evidence{JC} = true;
The following commands actually compute the answer:
>> [engine, loglik] = enter_evidence(engine, evidence);
>> marg = marginal_nodes(engine, B);
>> marg.T(true)
ans =
0.0163
The last command returns the answer (1.6%). Actually, marg.T is a table, indexed
by the possible domain values for the node B. It is the posterior distribution
for the node B given our evidence:
>> marg.T
ans =
0.9837
0.0163
Next, we do a causal inference. Causal inferences reason top-down
from causes to effects. Here, we ask for the likelihood that Mary calls
given that there is a burglary, P(MC|B). The evidence:
>> evidence = cell(1,N);
>> evidence{B} = true;
And the inference:
>> [engine, loglik] = enter_evidence(engine, evidence);
>> marg = marginal_nodes(engine, MC);
>> marg.T
ans =
0.3414
0.6586
Here, P(MC|B) is 65.9%. A simple intercausal inference is to ask what the likelihood
of one causal node having a particular value is, given the value of another. Here,
we ask for the likelihood of a burglary given an earthquake (and this simple network):
>> evidence = cell(1,N);
>> evidence{E} = true;
>> [engine, loglik] = enter_evidence(engine, evidence);
>> marg = marginal_nodes(engine, B);
>> marg.T(true)
ans =
1.0000e-03
The answer returned is 0.1% (low, as we would expect). This is not a terribly interesting
query. Often times, intercausal inference is known as "explaining away", and this next
example illustrates this pattern more fully. First, we do a simple diagnostic query asking
for the probability of a burglary given that the alarm went off. Next, we revise it with
evidence that an earthquake occurred. This lowers the answer considerably, hence "explaining
away" the alarm.
>> evidence = cell(1,N);
>> evidence{A} = true;
>> [engine, loglik] = enter_evidence(engine, evidence);
>> marg = marginal_nodes(engine, B);
>> marg.T(true)
ans =
0.3736
We see that the answer is 37.4%. Next, we add the evidence of an earthquake, and recompute:
>> evidence{E} = true;
>> [engine, loglik] = enter_evidence(engine, evidence);
>> marg = marginal_nodes(engine, B);
>> marg.T(true) % returns P(B|A,E)
ans =
0.0033
The probability drops to 3.3%, hence "explaining away" the alarm.
More details on how to use BNT are availabile here. The Matlab code for these examples can be found in this Matlab file.