Making your search smarter
Searches like the ones we've seen so far are, in a word, dumb.
They don't know which next state might be any better than any
other next state. These searches can be methodical (e.g.,
look at the first on the list) or random (e.g., Calvin's
decision: "Arbitrarily, I choose left."). These searches
settle for finding the goal state, but they don't care about
how many steps it takes to get from the initial state to the
goal state.
Usually, however, we don't have time to burn. We'd like to
strive to find the goal state in as few steps as we can.
That is, we'd like to try to find the "optimal path" from the
initial state to the goal state, and we can help ourselves
out here if we can put a little more "intelligence" into our
search.
One time-honored way of doing this is to find a method to
measure the "goodness" of a state -- that is, to determine
how close a given state is to the goal state. If we could
make that evaluation consistently and correctly, then when we
look at a list of states in trying to decide which to use
next to generate new states, we could pick the state closest
to the goal, instead of just picking the first one we see or
picking one at random.
Most of the time, though, such measurements of a state's
goodness are just estimates. If the estimate is wrong, you
could spend a lot of time and effort searching paths that
will never get you to the goal, or at least that will give
you less optimal solutions. The better the estimate of
goodness, the better the chance of finding an optimal path.
But unless the estimate is always right, there's no guarantee
of success. These measures of goodness are one example of
something called "heuristics": techniques that aid in the
discovery of solutions to problems most of the time, but
don't guarantee that they'll lead to solutions all of the
time.
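Here's a minimal sketch, in Python, of what "pick the state that looks
closest to the goal" might look like. Nothing in it comes from these
notes: the function name best_first_search, the toy problem (get from
1 to 10 by adding 1 or doubling), and the estimate (distance from 10)
are all made up for illustration. This style of search is often called
greedy best-first search.

```python
import heapq

def best_first_search(start, is_goal, successors, estimate):
    """Always expand the open state whose estimate looks closest to the goal."""
    # Each entry is (estimate, tie_breaker, state, path_so_far).
    frontier = [(estimate(start), 0, start, [start])]
    seen = {start}
    tie_breaker = 1  # keeps heapq from ever having to compare states directly
    while frontier:
        _, _, state, path = heapq.heappop(frontier)
        if is_goal(state):
            return path
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(
                    frontier, (estimate(nxt), tie_breaker, nxt, path + [nxt]))
                tie_breaker += 1
    return None  # ran out of states without reaching the goal

# Hypothetical toy problem: reach 10 from 1 by adding 1 or doubling.
GOAL = 10
path = best_first_search(
    start=1,
    is_goal=lambda n: n == GOAL,
    successors=lambda n: [n + 1, n * 2],
    estimate=lambda n: abs(GOAL - n),  # an estimate, not a guarantee
)
print(path)
```

On this toy problem the search reaches the goal in five steps
(1, 2, 4, 8, 9, 10) even though a four-step path (1, 2, 4, 5, 10)
exists. The estimate made 8 look better than 5, which is exactly the
kind of trouble an imperfect heuristic can get you into.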
Heuristics and the 8-tile puzzle
Let's look again at the 8-tile puzzle we saw earlier.
There we described a dumb, exhaustive, brute-force, depth-
first search for finding the goal state. Could you do
better? Probably yes. If you could come up with a way to
estimate how close any given arrangement of tiles was to the
goal, you could always choose to explore the state that was
nearest the goal. To do this, you'd have to figure out a way
to codify the metrics for this evaluation in such a way that
a computer could use them. One heuristic might be to just
count the number of tiles that are in the place they belong.
So if your goal state looks like this (an underscore marks the
empty space):

    1 2 3
    8 _ 4
    7 6 5

and your start state followed by the next possible states
looks like this:

               2 8 3
               1 6 4
               7 _ 5
              /  |  \
           /     |     \
        /        |        \
    2 8 3      2 8 3      2 8 3
    1 6 4      1 _ 4      1 6 4
    _ 7 5      7 6 5      7 5 _
    score      score      score
      3          6          3

which of these next states is closer to the goal using our
heuristic? The middle state has six tiles in the right
place (this assumes that we're going to count the empty space as
a tile), while the other two states have only three tiles in
the right place. So for our next step in the search, we'd
choose to generate all the states possible from that middle
state. Then we'd apply our evaluation heuristic again, and
so on. Of course, we could get more sophisticated with our
heuristic measures. For example, we could try to estimate
how many moves it would take to get all the tiles in their
appropriate places instead of just counting how many were
already in the right place. That might give us a better
measure of goodness, or it might just cause us to spend extra
time computing the goodness without any real return on the
investment, or it might just completely mislead the search.
We'd have to play with it for a while to see if it would help
us.
Game Search
For a single agent in a relatively non-hostile world, the
search for the path from some start state to some goal state
is not especially difficult, and in fact, it's sort of dull.
But the world isn't always peaceful---sometimes, there are
other agents out there trying to keep you from reaching your
goal, and at the same time those other agents are trying to
achieve some goal of their own. We see this kind of stuff
everywhere: in the executive boardroom, on the athletic
field, sometimes even in the classroom. And when we try to
model this kind of competitive behavior on a computer, we
have to keep in mind that while our computerized "good guy"
is going to try to move toward the goal in as optimal a
fashion as possible, the computerized "bad guy" is going to do
everything it can to keep us from getting there. Thus, the
question in a competitive or adversarial situation is no
longer "what's my optimal path to the goal?", but is instead
"what's my path to the goal when someone else is trying to
stop me?"
The fundamental change in the nature of that question results
in a fundamental change to the way we do state-space search
in adversarial situations, thus giving rise to something
called "adversarial search". And since we frequently use
this kind of search when we build intelligent game-playing
programs, this kind of search is frequently called "game
search".
The principle of game search is to first generate the state
space (or "game tree") some number of levels deep, where each
level corresponds to one player's move (or, more accurately,
the set of all moves that the player could possibly make at
that point). After generating the state space for that
number of levels, the nodes at the bottom level are evaluated
for goodness. (In the context of game playing, those nodes
are often called "boards", each one representing one possible
legal arrangement of game pieces on the game board.)
The estimate of the goodness of a board is a little bit
different than before, but not much. Since we have to worry
about the opponent, we set up our estimation function so that
it returns a spectrum of values, just like before, but now
the two extremes are boards that are great for us (i.e., we
win) and boards that are great for our opponent (i.e., we
lose). We apply our estimation function to those lowest
level boards, and propagate the numerical values upward to
help us determine which is the best move to make.
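These notes don't spell out here how the values get propagated upward.
One common scheme, usually called minimax, assumes that we pick the
largest value on our turns and that our opponent picks the smallest
value on theirs. The sketch below shows that scheme on a made-up tree;
the nested-list encoding, the function names, and the numbers are all
assumptions for illustration, not something taken from these notes.

```python
def backed_up_value(node, our_turn):
    """A node is either a number (a bottom-level board's score)
    or a list of child nodes."""
    if not isinstance(node, list):
        return node  # bottom level: use the estimate as is
    values = [backed_up_value(child, not our_turn) for child in node]
    # We want the largest value; we assume the opponent wants the smallest.
    return max(values) if our_turn else min(values)

def best_move(children):
    """Index of our move with the largest backed-up value.
    After any of our moves, it is the opponent's turn."""
    values = [backed_up_value(child, False) for child in children]
    return values.index(max(values))

# Hypothetical two-level tree: three moves for us, each followed by
# the opponent's possible replies, scored from our point of view.
tree = [[3, 5], [-10, 8], [2, 2]]

print(backed_up_value(tree, True))  # 3
print(best_move(tree))              # 0
```

The middle move has the single best-looking board (8), but it also lets
the opponent steer us to a loss (-10), so the first move comes out
ahead.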
The joy of hex
In order to explore the wild and wacky world of adversarial
or game search, we're going to have to introduce a game.
It's a simple game for two players, and it's called hexapawn,
for reasons which will become obvious.
The rules of hexapawn (at least in its original form) are
as follows:
The game is played on a 3x3 board. Each player begins with
three pawns lined up on opposite sides of the board. There
are three white pawns and three black pawns, which gives us a
grand total of six pawns, hence the name hexapawn. White
always moves first, just like in chess. The players take
turns moving their pawns. A pawn can move one square forward
to an empty square, or it can move one square diagonally
ahead (either to the left or right) to a square occupied by
an opponent's pawn, in which case the opponent's pawn is
removed from the board. One player wins when one of these
three conditions is true: 1) one of that player's pawns has
reached the opposite end of the board, 2) the opponent's
pawns have all been removed from the board, or 3) it's the
opponent's turn to move but the opponent can't move any
pawns.
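The rules above translate fairly directly into code. What follows is
only a sketch: the board encoding (three strings, top row first, with
W, B, and - for empty squares), the assumption that white starts on
the top row, and the names moves and winner are choices made for this
example.

```python
WHITE, BLACK, EMPTY = "W", "B", "-"
START = ("WWW", "---", "BBB")  # assumed layout: white on top, black on bottom

def other(player):
    return BLACK if player == WHITE else WHITE

def moves(board, player):
    """Return every board that one legal move by `player` can produce."""
    step = 1 if player == WHITE else -1  # white moves down, black moves up
    enemy = other(player)
    boards = []
    for r in range(3):
        for c in range(3):
            nr = r + step
            if board[r][c] != player or not 0 <= nr < 3:
                continue
            for nc in (c - 1, c, c + 1):
                if not 0 <= nc < 3:
                    continue
                # Straight ahead needs an empty square;
                # a diagonal step needs an opposing pawn to capture.
                wanted = EMPTY if nc == c else enemy
                if board[nr][nc] == wanted:
                    grid = [list(row) for row in board]
                    grid[r][c], grid[nr][nc] = EMPTY, player
                    boards.append(tuple("".join(row) for row in grid))
    return boards

def winner(board, to_move):
    """Return WHITE or BLACK if the game is over, else None."""
    squares = "".join(board)
    # 1) a pawn has reached the far side, or 2) one side has no pawns left
    if BLACK in board[0] or WHITE not in squares:
        return BLACK
    if WHITE in board[2] or BLACK not in squares:
        return WHITE
    # 3) the player whose turn it is cannot move
    if not moves(board, to_move):
        return other(to_move)
    return None

for board in moves(START, WHITE):
    print("\n".join(board), end="\n\n")
```

Note that winner needs to know whose turn it is, because the third
winning condition depends on it.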
It's not a very exciting game for human players, but it's
reasonably stimulating for humans who are required to write
programs to get computers to play this game, such as
yourselves. (It's not entirely clear how the computers feel
about it.) Hexapawn also serves as a very nice mechanism for
demonstrating the principles of game search with heuristics.
Hexapawn: catch the fever
Let's look at the beginning of a sample game of hexapawn and
see how we might get a computer to play the game. We'll let
our opponent take the side of the white pawns, and we'll play
the black pawns. The initial board configuration will be
represented like this:
W W W
- - -
B B B
As we said, white always gets to move first. This gives
white three possible initial moves, which are represented in
this way:
               W W W
               - - -
               B B B
              /  |  \
           /     |     \
        /        |        \
    - W W      W - W      W W -
    W - -      - W -      - - W
    B B B      B B B      B B B
But of course, white doesn't get to make all three moves.
White has to choose one and go with it. So let's say that
white opts for that move on the left. Our resulting state
space then looks like this:
                          W W W
                          - - -
                          B B B
                         /
                      /
                   /
               - W W
               W - -
               B B B
Everything was fine up until now. Now it's our turn. What
will we do? Well, what we'd like to do is look at all of our
possible next moves and make the best one, right? Sure. So
let's see what our options are on this move:
                          W W W
                          - - -
                          B B B
                         /
                      /
                   /
               - W W
               W - -
               B B B
              /  |  \
           /     |     \
        /        |        \
    - W W      - W W      - W W
    B - -      W B -      W - B
    B - B      B - B      B B -
Can we tell from just this which possible next move is the
best one? Maybe. That one on the left looks sort of nice,
since it leaves us with one more pawn than our opponent. But
we really can't tell just by looking at these different
boards which move is likely to lead to a win for us. Maybe
we could get a better idea of which of our three possible
moves is the best one by looking even further ahead to see
what white might do on the next turn:
                                         W W W
                                         - - -
                                         B B B
                                        /
                                     /
                                  /
                              - W W
                              W - -
                              B B B
                           /    |    \
                      /         |         \
                 /              |              \
          - W W               - W W               - W W
          B - -               W B -               W - B
          B - B               B - B               B B -
          / | \                / \                / | \
        /   |   \             /   \             /   |   \
      /     |     \          /     \          /     |     \
  - - W   - - W   - W -   - W -   - W -   - W W   - - W   - - W
  W - -   B W -   B - W   W B W   W W -   - - B   W W B   W - W
  B - B   B - B   B - B   B - B   B - B   B W -   B B -   B B -
                            ^               ^
                            |               |
                          white           white
                          wins            wins
Well now, maybe we know a little more than before. In fact,
we can see two boards there that indicate victory for our
opponent, and we could probably make some reasonable attempts
at estimating how close the other boards might be to either a
win or a loss for us. However, we could also look ahead yet
another move, and then another, and so on until we had mapped
out all the possibilities. The problem with doing this is
that it's going to cost us lots of computational resources.
This may not be a big deal when we're playing hexapawn with
only three pawns on a side, but it would be a big deal if we
extended the game to eight pawns on a side, for example. Or
maybe instead of hexapawn variations, we're playing something
like chess. Now the computational expense is prohibitive,
so we're going to have to settle on some arbitrary cutoff
for looking ahead in this or any game.
Since I'm running out of room to display all the possible
boards at the same level, let's make life easy for me and set
our arbitrary cutoff for looking ahead at two levels or two
moves ahead.
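To get a feel for how the number of boards grows with each level of
lookahead, here is a sketch that expands the game tree down to a
cutoff and counts the boards at each level. The board encoding (three
strings, top row first, white on top) and every name in it are
assumptions made for this example.

```python
WHITE, BLACK, EMPTY = "W", "B", "-"
START = ("WWW", "---", "BBB")  # assumed layout: white on top, black on bottom

def other(player):
    return BLACK if player == WHITE else WHITE

def moves(board, player):
    """Every board that one legal move by `player` can produce."""
    step = 1 if player == WHITE else -1  # white moves down, black moves up
    boards = []
    for r in range(3):
        for c in range(3):
            nr = r + step
            if board[r][c] != player or not 0 <= nr < 3:
                continue
            for nc in (c - 1, c, c + 1):
                if not 0 <= nc < 3:
                    continue
                # forward needs an empty square, diagonal needs a capture
                wanted = EMPTY if nc == c else other(player)
                if board[nr][nc] == wanted:
                    grid = [list(row) for row in board]
                    grid[r][c], grid[nr][nc] = EMPTY, player
                    boards.append(tuple("".join(row) for row in grid))
    return boards

def game_over(board, to_move):
    # A pawn reached the far side, or the player to move is stuck
    # (a player with no pawns left is also stuck).
    return WHITE in board[2] or BLACK in board[0] or not moves(board, to_move)

def count_boards(board, to_move, cutoff):
    """Number of boards at each level, looking ahead `cutoff` moves."""
    counts = [0] * (cutoff + 1)

    def expand(b, player, level):
        counts[level] += 1
        if level == cutoff or game_over(b, player):
            return  # stop at the cutoff or at a finished game
        for child in moves(b, player):
            expand(child, other(player), level + 1)

    expand(board, to_move, 0)
    return counts

print(count_boards(START, WHITE, 2))                   # [1, 3, 10]
print(count_boards(("-WW", "W--", "BBB"), BLACK, 2))   # [1, 3, 8]
```

The second call starts from the board after white's opening move on
the left, and its counts match the figure above: three boards for us,
then eight for white. Raising the cutoff shows how quickly the work
piles up, even on a board this small.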
The static board evaluation function
Above it was noted that two of those bottom-level boards were
wins for white. But what about those other boards? What do
they indicate for us? Will they lead to wins or losses for
us? How can we estimate that? How can we get a computer to
estimate that?
Providing that estimate is the job of something called the
"static board evaluation function". A static board
evaluation function takes as input the current state of a
game (i.e., the board) and returns a value corresponding to
the "goodness" of that current state or board. By "goodness"
we mean how close that board is to a victory for us---the
closer, the better. A simple static board evaluation
function might return, say, a positive number if the board is
good for us, a negative number if the board is not good for
us (but is consequently good for our opponent), and maybe a
zero if the board is neither bad nor good for either player.
How might we design such a function? Here's a weak first
attempt at one. Since we're playing on the black side of the
board, we'll have the function return a +10 if the board is
such that black wins. And we'll have it return a -10 if
white wins. (If we were playing on the white side of the
board, we'd want it to return a +10 if white won, and a -10
if black won.) Since we win if we can get one of our pawns
across the board to the other side, we should have the
function take that into account too. So if neither side has
won, let's have our function return the number of black pawns
with clear paths in front of them minus the number of white
pawns with clear paths in front of them. Oh, and since we win
if our opponent's pawns are all removed from the board, let's have
the function incorporate that. We'll have the function count
the number of black pawns on the board, subtract the number
of white pawns, add that number to the previous number, and
return that result. There, that wasn't so bad, was it?
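Here is one way that function might look in Python. It's a sketch
under several assumptions that the description above leaves open: the
board is three strings (top row first, white on top), the function is
told whose turn it is so that it can detect a player who can't move,
and a "clear path" means that every square ahead of the pawn in its
own column is empty. All the names are made up for this example.

```python
WHITE, BLACK, EMPTY = "W", "B", "-"

def can_move(board, player):
    """True if `player` has at least one legal move."""
    step = 1 if player == WHITE else -1  # white moves down, black moves up
    enemy = BLACK if player == WHITE else WHITE
    for r in range(3):
        for c in range(3):
            if board[r][c] != player or not 0 <= r + step < 3:
                continue
            ahead = board[r + step]
            if ahead[c] == EMPTY:
                return True  # can step forward
            if any(0 <= d < 3 and ahead[d] == enemy for d in (c - 1, c + 1)):
                return True  # can capture diagonally
    return False

def winner(board, to_move):
    squares = "".join(board)
    if BLACK in board[0] or WHITE not in squares:
        return BLACK
    if WHITE in board[2] or BLACK not in squares:
        return WHITE
    if not can_move(board, to_move):
        return WHITE if to_move == BLACK else BLACK
    return None

def clear_paths(board, player):
    """Pawns of `player` with only empty squares ahead in their column
    (this reading of "clear path" is an assumption)."""
    count = 0
    for r in range(3):
        for c in range(3):
            if board[r][c] != player:
                continue
            rows_ahead = range(r + 1, 3) if player == WHITE else range(0, r)
            if all(board[k][c] == EMPTY for k in rows_ahead):
                count += 1
    return count

def evaluate(board, to_move):
    """Static board evaluation from black's point of view."""
    who = winner(board, to_move)
    if who == BLACK:
        return 10
    if who == WHITE:
        return -10
    squares = "".join(board)
    path_score = clear_paths(board, BLACK) - clear_paths(board, WHITE)
    pawn_score = squares.count(BLACK) - squares.count(WHITE)
    return path_score + pawn_score

print(evaluate(("-WW", "B--", "B-B"), WHITE))  # 1
print(evaluate(("-W-", "WBW", "B-B"), BLACK))  # -10
```

The first board is the one where we captured a pawn; it scores +1
because the clear paths cancel out (one each) and we have one more
pawn. The second is the board where we can't move, so it scores -10.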
Next week, we'll see how you use the information generated
by your static board evaluation function.
Copyright 1998 by Kurt Eiselt. All rights reserved.
Last revised: May 7, 1998