CS 2360 - May 30, 1996

Lecture 20 -- Rule-Based Systems



A programming language is nothing more than a way of 
representing knowledge.  For example, your hexapawn player has 
a lot of knowledge about how to play hexapawn, but it's not 
exactly obvious by looking at the program just what that 
knowledge is, as it's represented in some fairly obscure 
ways.  Some functions return numbers, other functions 
propagate numbers...it's not easy to follow.

This kind of knowledge is often called "how-to" or 
"procedural" knowledge---knowledge about how to perform some 
task.  This is different from "what-is-it" or "declarative" 
knowledge, like that encoded in your "Savannah" network, or 
the inheritance hierarchy for animals.

Procedural knowledge doesn't necessarily have to be encoded 
as cryptically as it is in your hexapawn program.  Another way 
to represent this kind of knowledge is as a set of "if-then 
expressions" or "rules".  These rules take the form:

IF some condition(s) exist THEN perform some action(s)

A set of test-action or if-then rules is often called a 
"production system"---if the left side holds true, the right 
side is produced.

We see production systems used a lot in computer science.  If 
you should take a compiler course, for example, you'll see 
production systems used to describe programming languages:

[expression] := [expression] [arithmetic-op] [expression]

This is the stuff that compilers and compiler generators are 
made of.  It's also the backbone of natural language 
understanding system.  In the world of artificial 
intelligence, production systems are often called "rule-based 
systems", and are used not only for building flexible 
intelligent systems, but also for modelling human cognition.

Components of a rule-based system

A computational implementation of a rule-based system needs 
three parts:

1.  The rule base (also known as the knowledge base or 
    production memory).  This is the place where the rules
    are stored (the procedural knowledge).

2.  The rule interpreter (a.k.a. the control mechanism or the
    inference engine).  This is a means of determining which
    rules have tests that are satisfied, and then selecting
    one of those rules to have its associated action 
    performed.

3.  The data base (a.k.a. working memory, short term memory,
    the world model, or context).  This is an abstracted 
    representation of the world that this system "cares"
    about.  It represents the current state of the world.
    The left-hand sides of the rules look at this to see if
    they're satisfied; the right-hand sides act operate on
    this information.  This is where the facts or declarative
    knowledge is stored.

If you put enough knowledge into the rule base so that it can 
perform some interesting, complex task at the same 
performance level as a human, we could call this an "expert 
system".  You'll learn more about expert systems in CS 3361.

Designing a simple rule-based system for tic-tac-toe

You might get a better feel for rule-based systems if you 
actually go through the design of one.  Here's a half-baked 
attempt at a rule-based system for playing tic-tac-toe.  We 
won't worry much about implementation details; we'll just do 
some handwaving and describe the knowledge in English.

Here are some rules which might be useful, but these aren't 
necessarily all the rules.  Oh, and assume "row" means any 
adjacent squares, whether horizontal, vertical, or diagonal:

IF I have three tokens in a row      THEN I win; quit

IF the opponent has three tokens     THEN I lose; quit
   in a row

IF I have two tokens in a row and    THEN put my token in
   the third square in the row is         the third square
   empty

IF the opponent has two tokens in    THEN put my token in
   a row and the third square in          the third square
   the row is empty

IF the center square is empty        THEN put my token in
                                          the center square

IF a corner square is empty          THEN put my token in 
                                          that corner square

IF a side square is empty            THEN put my token in 
                                          that side square

And so on, and so on.  The first thing to note is that more 
than one of these rules may have tests that are satisfied at 
the same time.  We'll talk about how to select which rule to 
use later on.  The second thing to note is that these rules 
are described at a very high level.  There aren't many 
details, and certainly none that a computer could understand 
without a lot of help.  Nevertheless, we want to maintain 
that high-level description, as one of the things we want to 
accomplish with a rule-based system is to make the procedural 
knowledge more accessible, or more easily understandable, to 
us.  So even as we approach the implementation of something 
like this, we want to make sure our rules look more like:

(if center-square-empty? then put-token-center-square)

and not:

(cond ((null square0) (setf square0 'X)))

To make this happen, we must construct a rule interpreter, 
which interprets the language of if-then rules that describes 
the procedural knowledge for playing tic-tac-toe.  In other 
words, these if-then rules are yet another way of building a 
language on top of a language.

The rule interpreter

The rule interpreter is just a simple loop:

do until no left-hand-side is satisfied

  begin

    find all the rules whose tests (or preconditions or
    antecedents or left-hand-sides) are satisfied

    from that group, use a conflict resolution strategy
    (hang on, we'll get there) to select the most appropriate
    rule to be used

    execute that rule's actions (or postconditions or 
    consequents or right-hand-sides) and update the
    data base accordingly

  end

In addition to all this, you'll need a data base, so...

The data base

In the case of our tic-tac-toe game player, the data base is 
going to be really simple.  We just need to represent the 
current state of a game, so a board that looks like this:

      | |o
     -+-+-
      |x| 
     -+-+-
      | | 

might be represented like this:

((nil nil o)(nil x nil)(nil nil nil))

or one of any of a number of other possible representations.  
But of course, you'd employ such nice data abstraction 
techniques that only some simple accessor and constructor 
functions would have to know the details of the 
representation, right?

Conflict resolution

If you're silly enough to construct your rule-based system to 
be so specific that for every conceivable state of the data 
base there is only one rule that applies, you're going to run 
into some trouble.  First, for big problems, you'll have a 
very big rule base, and big rule bases make for slow 
programs.  Second, this program is not going to be very 
adaptable to new situations.  And third, it's a good 
indication that you're doing it wrong.

For example, a rule in your tic-tac-toe system should most 
definitely not be this specific:

IF the data base is 
   ((nil nil nil)(nil nil nil)(nil nil nil))
THEN change the data base to
   ((nil nil nil)(nil x nil)(nil nil nil))

So, in more general systems, there are going to be lots of 
times when two or more rules apply, but only one can be 
executed (at least on a serial computer).  That's where 
"conflict resolution" comes into play.  There are any number 
of conflict resolution strategies; here are just a few ways 
to select one rule when many rules have had their left-hand-
sides satisfied:

1.  Use the first rule whose left-hand-side was satisfied.
    Thus, ordering of rules is important.  This can get ugly
    in big systems with lots of rules, especially as you add,
    delete, or change rules (e.g., where does that rule go
    now?).

2.  Assign priorities to rules, and use the rule with the 
    highest priority.  But then how do you determine the
    priorities?

3.  Use the most specific rule---the one with the most 
    details or constraints in the left-hand-side.

4.  Use the most recently used (or least recently used)
    rule.

5.  Use the rule which matches the most recently added piece
    of knowledge in the data base.

6.  Choose a rule arbitrarily---flip a computational coin.

7.  Construct multiple copies of the data base and use all
    rules in parallel, with each rule affecting only one copy
    of the data base.  This could, of course, get ugly over
    time.

Does the word "search" ring a bell?

Once we've implemented our little tic-tac-toe rule-based 
system and start it up, what kind of behavior will we see?
Well, we'd start with the initial state, abstractly 
represented like this:

                              | | 
                             -+-+-
                              | | 
                             -+-+-
                              | | 

There are a few rules that might apply:

                              | | 
                             -+-+-
                              | | 
                             -+-+-
                              | | 
                             / | \
                            /  |  \
                           /   |   \
                          /    |    \
                      mark    mark   mark
                     center  corner  side
                       /       |       \
                      /        |        \

The conflict resolution strategy would then select one rule, 
whose action would then be performed on the data base:

                              | | 
                             -+-+-
                              | | 
                             -+-+-
                              | | 
                             / | \
                            /  |  \
                           /   |   \
                          /    |    \
                      mark    mark   mark
                     center  corner  side
                       /       |       \
                      /        |        \
                     | | 
                    -+-+-
                     |x| 
                    -+-+-
                     | | 

The opponent then marks a square:

                              | | 
                             -+-+-
                              | | 
                             -+-+-
                              | | 
                             / | \
                            /  |  \
                           /   |   \
                          /    |    \
                      mark    mark   mark
                     center  corner  side
                       /       |       \
                      /        |        \
                     | | 
                    -+-+-
                     |x| 
                    -+-+-
                     | | 
                      |
                      |
                      |
                    o| | 
                    -+-+-
                     |x| 
                    -+-+-
                     | | 

And then there would be some rules that you could apply:

                              | | 
                             -+-+-
                              | | 
                             -+-+-
                              | | 
                             / | \
                            /  |  \
                           /   |   \
                          /    |    \
                      mark    mark   mark
                     center  corner  side
                       /       |       \
                      /        |        \
                     | | 
                    -+-+-
                     |x| 
                    -+-+-
                     | | 
                      |
                      |
                      |
                    o| | 
                    -+-+-
                     |x| 
                    -+-+-
                     | | 
                    /   \
                   /     \
                  /       \
                 /         \
             mark           mark
            corner          side
              /               \
             /                 \

Does this look familiar?  Well, of course it does!  It's a 
depth-first state space search, guided by heuristics encoded 
as if-then rules.  In other words, this is yet another way of 
looking at stuff we've looked at before.

Advantages and disadvantages of rule-based systems

As with anything we've seen in this class so far, there are 
advantages and disadvantages to representing procedural 
knowledge as rules.  Here are some advantages:

1.  The knowledge is all represented in the same homogeneous
    format, which makes it all easier to understand.

2.  The rules are modular or independent; thus, adding or
    changing knowledge should be easy to do without goofing
    up everything else.

3.  If-then rules are an "intuitive" or "natural" way to 
    represent knowledge; they're easy for us to work with.

However, there are some disadvantages:

1.  Rule-based systems are highly inefficient.  In order to 
    use just one rule, the system must examine all the rules.

2.  Despite our best intentions, hidden dependencies between
    rules may exist.  This makes it hard to predict the 
    effects of adding or deleting a rule.

3.  It's hard to see the flow of control in a rule-based
    system, so they're difficult to design and difficult
    to debug.

Logic as a rule-based system

Finally, in this lecture, we talked briefly about predicate 
calculus or, as some folks like to call it, logic.  You know 
about this stuff from CS 1155, right?  Anyway, doing theorem 
proving using logic is just another form of a rule-based 
system.  There are automated theorem provers, and there's 
even a programming language based on predicate calculus 
called PROLOG (as in "PROgramming with LOGic").  PROLOG 
interpreters can be, and have been, built on top of LISP, and 
the have many of the same advantages and disadvantages that 
other rule-based systems have.



Copyright 1996 by Kurt Eiselt.  All rights reserved.

Last revised: June 2, 1996