CS 2360 - February 13, 1997

Lecture 12 -- Heuristic Search


Making your search smarter

The searches we've seen so far are, in a word, dumb.  
They don't know whether one next state is any better than 
another.  They can be methodical (e.g., look at the first 
state on the list) or random (e.g., Calvin's decision: 
"Arbitrarily, I choose left.").  They settle for finding the 
goal state, but they don't care how many steps it takes to 
get from the initial state to the goal state.

Usually, however, we don't have time to burn.  We'd like to 
strive to find the goal state in as few steps as we can.  
That is, we'd like to try to find the "optimal path" from the 
initial state to the goal state, and we can help ourselves 
out here if we can put a little more "intelligence" into our 
search.

One time-honored way of doing this is to find a method to 
measure the "goodness" of a state -- that is, to determine 
how close a given state is to the goal state.  If we could 
make that evaluation consistently and correctly, then when we 
look at a list of states in trying to decide which to use 
next to generate new states, we could pick the state closest 
to the goal, instead of just picking the first one we see or 
picking one at random.
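
Here is a minimal sketch of that idea in Python.  The names 
pick_closest and distance_to_goal are made up for this 
illustration, and the "states" are just integers with a goal 
of 10, so that the estimate is trivially easy to compute.

```python
def pick_closest(open_states, estimate):
    """Return the state whose estimated distance to the goal is smallest."""
    return min(open_states, key=estimate)


# Toy problem (an assumption for illustration): states are integers
# and the goal state is 10.
GOAL = 10


def distance_to_goal(state):
    """Estimate how far a state is from the goal; smaller is better."""
    return abs(GOAL - state)


open_states = [3, 14, 8, 25]

# A dumb search would take open_states[0]; a smarter one asks the estimate.
best = pick_closest(open_states, distance_to_goal)
print(best)
```

The only change from the dumb searches is the rule for choosing 
which state to expand next.  Everything else about the search 
stays the same.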

Most of the time, though, such measurements of a state's 
goodness are just estimates.  If the estimate is wrong, you 
could spend a lot of time and effort searching paths that 
will never get you to the goal, or at least that will give 
you less than optimal solutions.  The better the estimate of 
goodness, the better the chance of finding an optimal path.  
But unless the estimate is always right, there's no guarantee 
of success.  These measures of goodness are one example of 
something called "heuristics":  techniques that aid in the 
discovery of solutions to problems most of the time, but 
don't guarantee that they'll lead to solutions all of the 
time.

Heuristics and the 8-tile puzzle

Let's look again at the 8-tile puzzle we saw earlier.
There we described a dumb, exhaustive, brute-force, depth-
first search for finding the goal state.  Could you do 
better?  Probably yes.  If you could come up with a way to 
estimate how close any given arrangement of tiles was to the 
goal, you could always choose to explore the state that was 
nearest the goal.  To do this, you'd have to express that 
estimate in a form that a computer could compute.  One 
heuristic might be to just count the number of tiles that are 
already in the place they belong (not counting the blank).  
So if your goal state looks like this:

                  1 2 3
                  8   4
                  7 6 5

and your start state followed by the next possible states 
looks like this:

                  2 8 3
                  1 6 4
                  7   5
                   /|\
                  / | \
                 /  |  \
                /   |   \
               /    |    \
              /     |     \
             /      |      \
            /       |       \
          2 8 3   2 8 3   2 8 3
          1 6 4   1   4   1 6 4
            7 5   7 6 5   7 5

          score   score   score
            3       5       3

which of these next states is closest to the goal using our 
heuristic?  The middle state has five tiles in the right 
place, while the other two states have only three tiles in 
the right place.  So for our next step in the search, we'd 
choose to generate all the states possible from that middle 
state.  Then we'd apply our evaluation heuristic again, and 
so on.  Of course, we could get more sophisticated with our 
heuristic measures.  For example, we could try to estimate 
how many moves it would take to get all the tiles in their 
appropriate places instead of just counting how many were 
already in the right place.  That might give us a better 
measure of goodness, or it might just cause us to spend extra 
time computing the goodness without any real return on the 
investment, or it might just completely mislead the search.  
We'd have to play with it for a while to see if it would help 
us.
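
The example above can be sketched in Python as follows.  The 
board representation (a tuple of nine numbers read row by row, 
with 0 for the blank) and all of the function names are 
assumptions made for this sketch.  The second heuristic, 
total_distance, is one possible way to estimate the number of 
moves remaining:  it adds up how many rows and columns each 
tile is away from where it belongs.

```python
GOAL = (1, 2, 3,
        8, 0, 4,
        7, 6, 5)      # 0 marks the blank square

START = (2, 8, 3,
         1, 6, 4,
         7, 0, 5)


def tiles_in_place(board, goal=GOAL):
    """Count the tiles (not the blank) already where they belong."""
    return sum(1 for tile, want in zip(board, goal)
               if tile != 0 and tile == want)


def total_distance(board, goal=GOAL):
    """Sum, over all tiles, of the row plus column distance to the goal square."""
    total = 0
    for index, tile in enumerate(board):
        if tile == 0:
            continue
        target = goal.index(tile)
        total += abs(index // 3 - target // 3) + abs(index % 3 - target % 3)
    return total


def successors(board):
    """Return the boards made by sliding the blank left, up, right, or down."""
    blank = board.index(0)
    row, col = divmod(blank, 3)
    result = []
    for d_row, d_col in ((0, -1), (-1, 0), (0, 1), (1, 0)):
        new_row, new_col = row + d_row, col + d_col
        if 0 <= new_row < 3 and 0 <= new_col < 3:
            swap = new_row * 3 + new_col
            cells = list(board)
            cells[blank], cells[swap] = cells[swap], cells[blank]
            result.append(tuple(cells))
    return result


children = successors(START)
print([tiles_in_place(child) for child in children])   # [3, 5, 3]

# Higher is better for tiles_in_place, so expand the highest scorer next.
best = max(children, key=tiles_in_place)
print(best)
```

Note that the two heuristics point in opposite directions:  a 
high tiles_in_place score is good, while a low total_distance 
is good.  On this particular example both of them prefer the 
middle state.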


Game Search

For a single agent in a relatively non-hostile world, the 
search for the path from some start state to some goal state 
is not especially difficult, and in fact, it's sort of dull.  
But the world isn't always peaceful.  Sometimes there are 
other agents out there trying to keep you from reaching your 
goal, and at the same time those other agents are trying to 
achieve some goal of their own.  We see this kind of stuff 
everywhere:  in the executive boardroom, on the athletic 
field, sometimes even in the classroom.  And when we try to 
model this kind of competitive behavior on a computer, we 
have to keep in mind that while our computerized "good guy" 
is going to try to move toward the goal in as optimal a 
fashion as possible, the computerized "bad guy" is going to 
do everything it can to keep us from getting there.  Thus, the 
question in a competitive or adversarial situation is no 
longer "what's my optimal path to the goal?", but is instead 
"what's my path to the goal when someone else is trying to 
stop me?"

The fundamental change in the nature of that question results 
in a fundamental change to the way we do state-space search 
in adversarial situations, thus giving rise to something 
called "adversarial search".  And since we frequently use 
this kind of search when we build intelligent game-playing 
programs, this kind of search is frequently called "game 
search".

The principle of game search is to first generate the state 
space (or "game tree") some number of levels deep, where each 
level corresponds to one player's move (or, more accurately, 
the set of all moves that the player could possibly make at 
that point).  After generating the state space for that 
number of levels, the nodes at the bottom level are evaluated 
for goodness.  (In the context of game playing, those nodes 
are often called "boards", each one representing one possible 
legal arrangement of game pieces on the game board.)

The estimate of the goodness of a board is a little bit 
different from before, but not much.  Since we have to worry 
about the opponent, we set up our estimation function so that 
it returns a spectrum of values, just like before, but now 
the two extremes are boards that are great for us (i.e., we 
win) and boards that are great for our opponent (i.e., we 
lose).  We apply our estimation function to those lowest 
level boards, and propagate the numerical values upward to 
help us determine which is the best move to make.
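
The notes so far say that values are propagated upward but not 
exactly how.  The sketch below assumes one standard rule, 
usually called minimax:  on our turn we take the largest value 
among the boards below, and on the opponent's turn we assume 
the opponent picks the smallest.  The tiny hand-built tree and 
every name in the code are made up for illustration.

```python
def minimax(node, depth, our_turn, children, evaluate):
    """Return the backed-up value of node, looking depth levels down."""
    kids = children(node)
    if depth == 0 or not kids:
        return evaluate(node)          # bottom level: estimate goodness
    values = [minimax(kid, depth - 1, not our_turn, children, evaluate)
              for kid in kids]
    # We pick what is best for us; the opponent picks what is worst for us.
    return max(values) if our_turn else min(values)


def best_move(node, depth, children, evaluate):
    """Return the child of node with the highest backed-up value."""
    return max(children(node),
               key=lambda kid: minimax(kid, depth - 1, False,
                                       children, evaluate))


# A two-level toy game tree (an assumption for this sketch).
TREE = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
SCORES = {"a1": 3, "a2": 7, "b1": -2, "b2": 9}   # big = good for us


def children(node):
    return TREE.get(node, [])


def evaluate(node):
    return SCORES.get(node, 0)


print(best_move("root", 2, children, evaluate))   # a
```

Board b2 has the highest score in the tree, but move b is the 
worse choice:  once we move to b, the opponent gets to choose, 
and will steer us to b1.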

The joy of hex

In order to explore this wild and wacky world of adversarial 
or game search, we're going to have to introduce a game.  
It's a simple game for two players, and it's called hexapawn, 
for reasons which will become obvious.

The rules of hexapawn (at least in its original form) are 
as follows:

The game is played on a 3x3 board.  Each player begins with 
three pawns lined up on opposite sides of the board.  There 
are three white pawns and three black pawns, which gives us a 
grand total of six pawns, hence the name hexapawn.  White 
always moves first, just like in chess.  The players take 
turns moving their pawns.  A pawn can move one square forward 
to an empty square, or it can move one square diagonally 
ahead (either to the left or right) to a square occupied by 
an opponent's pawn, in which case the opponent's pawn is 
removed from the board.  One player wins when one of these 
three conditions is true:  1) one of that player's pawns has 
reached the opposite end of the board, 2) the opponent's 
pawns have all been removed from the board, or 3) it's the 
opponent's turn to move but the opponent can't move any 
pawns.

It's not a very exciting game for human players, but it's 
reasonably stimulating for humans who are required to write 
programs to get computers to play this game, such as 
yourselves.  (It's not entirely clear how the computers feel 
about it.)  Hexapawn also serves as a very nice mechanism for 
demonstrating the principles of game search with heuristics.



Copyright 1997 by Kurt Eiselt.  All rights reserved.

Last revised: February 18, 1997