CS 2360 - October 27, 1998

Lecture 10 -- Representation in Networks


Representing trees as A-lists

Last time we looked at representations of tree-like data structures.
Here's another way to represent trees that we didn't mention in 
the previous class or today.  This is the binary tree from Thursday's
lecture:

(a (b (d nil nil) (e nil nil)) (c (f nil nil) (g nil nil)))

And here is that same tree encoded with an A-list:

((a (b c))     ;; a has a left child b and a right child c
 (b (d e))     ;; b has a left child d and a right child e
 (c (f g))     ;; etc.
 (d (nil nil)) ;; d has no left child and no right child
 (e (nil nil)) ;; etc.
 (f (nil nil))
 (g (nil nil)))

Do we need (d (nil nil)) etc. in this list? The answer is that it 
doesn't matter as long as we abstract the problem sufficiently.  As 
long  as there is some code which implements "get-left-subtree" and
"get-right-subtree" it is not important to the higher layers of
the software whether we put the (d (nil nil)) in the list or not. 
We just deal with the get child functions and depend on them
to do their job. 
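
Here is a minimal sketch of why the childless entries are optional.
The names "left-child", "right-child", *full-tree*, and *sparse-tree*
are made up for this illustration and are not from the lecture; only
the a-list contents come from the notes above.  It relies on the fact
that "assoc" returns nil for a missing key and that "first" and
"second" of nil are nil:

```lisp
;; The tree exactly as written in the notes.
(defparameter *full-tree*
  '((a (b c)) (b (d e)) (c (f g))
    (d (nil nil)) (e (nil nil)) (f (nil nil)) (g (nil nil))))

;; The same tree with the childless entries left out.
(defparameter *sparse-tree*
  '((a (b c)) (b (d e)) (c (f g))))

(defun left-child (tree node)
  ;; (assoc node tree) is nil when NODE has no entry,
  ;; and (first (second nil)) is still nil
  (first (second (assoc node tree))))

(defun right-child (tree node)
  (second (second (assoc node tree))))
```

With these definitions, (left-child *full-tree* 'd) and
(left-child *sparse-tree* 'd) both return nil, so code written on
top of the accessors can't tell the two lists apart.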

Can we adapt the preorder tree traversal code from last time to
traverse this binary tree embedded in an A-list?  You bet.  Here's
the code from last time:

(defun print-preorder (tree)
  (cond ((null (get-root tree)) nil)
        (t (print (get-root tree))
           (print-preorder (get-left-subtree tree))
           (print-preorder (get-right-subtree tree)))))

(defun get-root (tree)
  (first tree))

(defun get-left-subtree (tree)
  (second tree))

(defun get-right-subtree (tree)
  (third tree))

We're going to have to tweak all of this code a little bit, because
of the use of the A-list, but we don't have to tweak it much.  Note
that in the nested list representation:

(a (b (d nil nil) (e nil nil)) (c (f nil nil) (g nil nil)))

the root node of the tree is defined as being the first element
of a three-element list.  So whatever is root is defined by its
position in the list, right?  But this assumption doesn't hold
true for the association list.  That is, the root of the tree
in this structure:

((a (b c))
 (b (d e))
 (c (f g))
 (d (nil nil))
 (e (nil nil))
 (f (nil nil))
 (g (nil nil)))

isn't necessarily the first thing in that A-list.  I could change
things around like this:

((f (nil nil))
 (b (d e))
 (e (nil nil))
 (a (b c))
 (d (nil nil))
 (c (f g))
 (g (nil nil)))

and my binary tree hasn't changed.  (If you don't believe it, draw
a little tree diagram with circles and arrows for both A-lists
and see for yourself that they're the same.)

The upshot of this observation is that to get things started, I have
to tell my preorder function not only what the tree looks like, but
what the root node is.  It won't be able to figure that out for itself.
So I'm going to have to add an extra parameter to accommodate root 
information to go along with tree information.  So everywhere I passed
"tree" before, I should somehow pass "root" too.  Got it so far?
Nah, I didn't think so.  So let me take you through the top-level
procedure and show you the hows and whys of the changes:

First I add the "root" arg to the input argument list so I can
tell the procedure which node is the root of the tree:

(defun print-preorder-alist (tree root) 

Without worrying about the details of how "get-root" is implemented,
I figure I should pass the root info via the argument list to
"get-root".  As we'll see later, this will prove to be superfluous
effort:

  (cond ((null (get-root tree root)) nil)

Again I add the root argument in the call to "get-root":

        (t (print (get-root tree root))

Now we get to the hard parts.  They're not really that hard once
you get comfortable with the a-list thing.  First off, I know
that I want to pass the root info to "get-left-subtree" and
"get-right-subtree", so that's pretty easy.  But in the previous
(non a-list) version of this procedure, "get-left-subtree" and
"get-right-subtree" returned entire sub-chunks of the tree, right?
Would that be easy to do using an a-list, given that the order
of things in the a-list doesn't mean what it used to?  No.
So instead of returning whole subtrees, what should those 
functions return?  If you said "the root of the subtree" then
you get it.  If you said something else, then go back and keep
rereading this until "the root of the subtree" makes sense
as the right answer.  So I modify the calls to "get-left-subtree"
and "get-right-subtree" as noted previously.  If those function
calls return the root of the subtree, then I want to pass the
result of those function calls as the root argument to my
"print-preorder-alist" function, and I also have to pass the tree
itself, so the next two expressions end up looking like this:

           (print-preorder-alist tree (get-left-subtree tree root))
           (print-preorder-alist tree (get-right-subtree tree root)))))

So now the "print-preorder-alist" function looks like this:

(defun print-preorder-alist (tree root)
  (cond ((null (get-root tree root)) nil)
        (t (print (get-root tree root))
           (print-preorder-alist tree (get-left-subtree tree root))
           (print-preorder-alist tree (get-right-subtree tree root)))))

What about the details of accessing the data structure?  Well, as we
hinted at above, if you pass the root to "get-root", "get-root" doesn't
have to do much of anything but return what it was passed:

(defun get-root (tree root)
  (declare (ignore tree))  ; the tree isn't needed; we were handed the root
  root)

Then to get the roots of the left and right subtrees, you use the
magic function "assoc" and then apply the correct combination of
"first"s and "rest"s to get at the information you need.  (You
may want to trace the behavior of all this stuff by hand if you
don't understand why the two functions below look the way they do.):

(defun get-left-subtree (tree root)
  (first (first (rest (assoc root tree)))))

(defun get-right-subtree (tree root)
  (second (first (rest (assoc root tree)))))

If you pass our new "print-preorder-alist" function the tiny tree
that we implemented as an a-list along with the symbol 'a which
is the root of this tree, you'll see a nice preorder tree traversal,
assuming I didn't introduce any bugs into this program.  Why is
it so important to have this ability to traverse a binary tree
that's implemented as an a-list?  That little exercise above
gives us a clue as to how to implement an incredibly useful
if somewhat disorganized (relatively speaking) data structure
called the "relational network".
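
If you'd rather check the traversal than take my word for it, here
is one way you might do so.  This is only a sketch: the function
"collect-preorder-alist" and the two variable names are made up for
this illustration, and instead of printing each node it collects the
nodes into a list so the result is easy to compare.  The two accessor
functions are the a-list versions shown above:

```lisp
;; The tiny tree, in the original order and in the shuffled order.
(defparameter *tiny-tree*
  '((a (b c)) (b (d e)) (c (f g))
    (d (nil nil)) (e (nil nil)) (f (nil nil)) (g (nil nil))))

(defparameter *shuffled-tree*
  '((f (nil nil)) (b (d e)) (e (nil nil)) (a (b c))
    (d (nil nil)) (c (f g)) (g (nil nil))))

(defun get-left-subtree (tree root)
  (first (first (rest (assoc root tree)))))

(defun get-right-subtree (tree root)
  (second (first (rest (assoc root tree)))))

;; Hypothetical variant of print-preorder-alist that returns the
;; nodes in preorder instead of printing them.
(defun collect-preorder-alist (tree root)
  (cond ((null root) nil)
        (t (cons root
                 (append
                  (collect-preorder-alist tree (get-left-subtree tree root))
                  (collect-preorder-alist tree (get-right-subtree tree root)))))))
```

Both (collect-preorder-alist *tiny-tree* 'a) and
(collect-preorder-alist *shuffled-tree* 'a) return (a b d e c f g),
which is the order in which the printing version visits the nodes.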


Another abstraction exercise

Does the a-list above have to look like this?:

((a (b c))
 (b (d e))
 (c (f g))
 (d (nil nil))
 (e (nil nil))
 (f (nil nil))
 (g (nil nil)))

No.  It could just as easily look like this:

((a b c)
 (b d e)
 (c f g)
 (d nil nil)
 (e nil nil)
 (f nil nil)
 (g nil nil))

That was just a choice on my part.  No big deal, since my clever
use of data abstraction allows me to isolate and change the
appropriate functions as the details of my data structure change:

(defun get-left-subtree (tree root)
  (second (assoc root tree)))

(defun get-right-subtree (tree root)
  (third (assoc root tree)))

Isn't abstraction wonderful?


Welcome to the relational network

We've been looking at tree-like data structures a lot lately, 
and by now you're probably wondering what is the big 
fascination with these things.  The reason we get all tingly 
about trees, or any sort of hierarchical data structure, is 
that they're great ways to organize knowledge.  Traditional 
linear structures, like files with lots of records, or even 
to some extent simple linear linked lists like those we've 
been using in this class, make it difficult to represent the 
wide variety of relationships which exist between entities in 
the world.  Trees get us a step closer to representing the
complexities of relationships between things in the real 
world because they get us closer to that notion of a relational
network mentioned above.  Many times when we're using computers, 
we're really trying to build computational models of some aspect of 
the real world, and structures like trees and networks help
us make more useful models.


Networks in the dictionary

We see hierarchical organizations in the real world all the 
time.  They may not be "pure" hierarchies, but they're 
hierarchical in spirit at least.  It might be easier to think 
of these things as "networks" instead of hierarchies.  Take 
for example the common dictionary.  At first glance, it looks 
like a very linear organization of the words in our language.  
But what a dictionary really specifies is a very complex and 
somewhat hierarchical map of the relationships between the 
words in our language.  Here are some sample definitions:

dog:        any of a large and varied group of domesticated
            animals related to the fox, wolf, and jackal

chihuahua:  any of an ancient Mexican breed of very small dog
            with large, pointed ears

bird:       any of a class of warm-blooded, two-legged, 
            egg-laying vertebrates with feathers and wings

penguin:    any of an order of flightless birds found in the
            Southern Hemisphere, having webbed feet and 
            paddle-like flippers for swimming and diving

ostrich:    a large, swift-running bird of Africa and the 
            Near East; the largest and most powerful of 
            living birds; it has a long neck, long legs, two
            toes on each foot, and small useless wings

canary:     a small yellow songbird of the finch family, 
            native to the Canary Islands

Notice that these definitions all relate the thing being 
defined to some larger class of things, and then go on to 
try to distinguish that thing from other similar things.  
Note also that as the things being described stray further 
and further from what we might think of as being norms or 
stereotypes, the definitions get longer and more detailed.  
For example, compare the canary (a stereotypical bird) to an 
ostrich (an extremely non-stereotypical bird).

When we take the time to look at the dictionary in this way, 
we uncover what is essentially a bunch of pointers from one 
word to others.  Because I'm trying to prove a point here, 
I've focused on the animal kingdom, knowing that zoologists 
spend a lot of time building these "taxonomies", or 
hierarchies of what is related to what.  But it works for 
things other than animals:

chair:      a piece of furniture for one person to sit on,
            having a back and, usually, four legs.

And so on.  You can look up more if you like.  Try "couch", 
"sofa", "table", "ottoman", and whatever.  No matter what 
noun you look up, you'll find the same sort of pattern: 
relate this word to a larger class of things, then describe 
some features to distinguish this thing from similar sorts of 
things.  Of course, I've purposely avoided dealing with 
another big class of words here--a class you know as "verbs".  
We'll save that topic for CS 4344, the course on natural 
language understanding.  

The dictionary writers don't always help as much as we might 
like, however:

rock:       a large mass of stone

stone:      the hard, solid, nonmetallic mineral matter that
            rock is composed of

Except for the extra bit of information that stone is mineral 
matter, all we know here is that rock is made of stone, and 
stone is what makes up rock.  Sigh.

In any case, we can use our high-level data abstraction, the 
directed graph, to make these relationships a bit more visual.  
For example, from the bird definitions, we can construct the 
following abstraction:

                   vertebrate
                        ^
                        | is-a
                        |
                        |          has-part
                        |        /------------- wings
                        |       /  reproduction
                        |      /--------------- egg-laying
                        |     /    body-temp
                        |    /----------------- warm-blooded
                      bird--<      no. of legs
                      ^ ^ ^  \----------------- 2
                     /  |  \  \    covering
               is-a /   |   \  \--------------- feathers
                   /    |    \  \  movement
       color      /     |     \  \------------- flight
yellow ------canary     |      \
       size  /          | is-a  \ is-a
 small -----/           |        \       movement
                        |        ostrich---------- run
            movement    |              \  size
      swim ----------penguin            \--------- big

OK, so I fudged this a little.  I had to infer that the fact 
that birds have wings means that they move around by flying; 
the dictionary writers didn't tell us that.  And I had to 
infer that the fact that birds were egg-laying told us 
something about their reproductive processes, and so on.  But 
you get the idea, no?


Networks in your head

These sorts of knowledge hierarchies show up elsewhere.  
Independent of this dictionary organization, psychologists in 
the 1960s theorized that humans organize at least some of 
what they know in a similar hierarchical fashion.  For 
example, they said, a person's knowledge of things in the 
world might be organized along these lines:

                                all things
                                 /      \
                                /        \
                               /          \
                              /            \
                physical objects          abstract objects
                  /           \                /     \
                 /             \              /       \
                /               \           time   thought
               /                 \
       inanimate                 animate
        objects                  objects
         /   \                   /     \
        /     \                 /       \
       /       \               /         \
  inorganics   plants    mammals         birds
     /   \               /     \         /    \
    /     \             /       \       /      \
  rock   car          dog      human  canary  ostrich

  note:  assume that all links point upward


This hierarchy is by no means complete, nor is it exactly 
what the psychologists, Collins and Quillian, had proposed, 
but it's sufficient for our purposes.

If we think of all the upward links as being relationships of 
the form, "the thing below is a subtype of the thing above," 
then we have something called a "type hierarchy".  And in 
fact, we could put the label "S" on all those upward links, 
to indicate that the thing below is a "S"ubtype of the thing 
above.  Folks in the world of artificial intelligence don't 
use the "S" or "subtype" terminology very much; instead, AI 
folks use the label "is-a" in hierarchies like this, as in "a 
canary is-a bird".  So these hierarchies can also be called 
"is-a hierarchies".  Note that we went ahead and used that 
"is-a" label in the first diagram above.

In any case, Collins and Quillian backed up their theory with 
experiments based on this premise:  If people really store 
knowledge in hierarchical form, then if they're asked the 
right questions, we should note significant differences in 
the time it takes those people to respond correctly.  For 
example, the time to answer "yes" to "Is a canary a bird?" 
should be less than the time to answer "yes" to "Is a canary 
an animate object?"

The experiments did in fact generate the right numbers, and 
for a while everyone thought the question of how human memory 
is organized had been answered.  But other experimenters had 
difficulty replicating these results, so there was some 
controversy about just how Collins and Quillian obtained 
their results.  Nevertheless, hierarchical models of human 
memory are still very popular, although they are considerably 
different in their organization than the one we've just 
looked at.


Why do the arrows point up?

Well now, that's an interesting question, no?  The reason, at 
least from either a psychological or AI point of view, is that 
humans typically are better at answering questions like "Is a 
dog a mammal?" than questions like "Name all the mammals you know."  
In other words, people are better at recognition than recall 
or retrieval.  The upward arrows in our diagrams suggest that 
it would be easier to start at the "dog" node and traverse 
the link up to the "mammal" node to answer "Is a dog a 
mammal?", than it would be to start at the "mammal" node and 
try to traverse all the downward links in an effort to 
enumerate all the different types of mammals.
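
To make the asymmetry concrete, here is a rough sketch.  It assumes
the type hierarchy from the diagram above is stored as an a-list of
(child parent) pairs; that storage choice and the names "is-a-p",
"subtypes-of", and *type-hierarchy* are invented for this
illustration.  Going up takes one "assoc" per level, while going
down means scanning every entry in the list:

```lisp
;; Each entry is (child parent); every link points upward.
(defparameter *type-hierarchy*
  '((dog mammals) (human mammals) (canary birds) (ostrich birds)
    (rock inorganics) (car inorganics)
    (mammals animate-objects) (birds animate-objects)
    (inorganics inanimate-objects) (plants inanimate-objects)
    (animate-objects physical-objects)
    (inanimate-objects physical-objects)
    (time abstract-objects) (thought abstract-objects)
    (physical-objects all-things) (abstract-objects all-things)))

;; "Is a dog a mammal?"  Follow the upward links from THING.
(defun is-a-p (thing type hierarchy)
  (let ((parent (second (assoc thing hierarchy))))
    (cond ((null parent) nil)          ; ran off the top of the hierarchy
          ((eql parent type) t)
          (t (is-a-p parent type hierarchy)))))

;; "Name all the mammals."  No downward links are stored, so we
;; have to look at every entry.
(defun subtypes-of (type hierarchy)
  (mapcar #'first
          (remove-if-not (lambda (entry) (eql (second entry) type))
                         hierarchy)))
```

For example, (is-a-p 'dog 'mammals *type-hierarchy*) returns T and
(subtypes-of 'mammals *type-hierarchy*) returns (dog human).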


Inheritance

We can get more utility out of our hierarchies if we add 
important and distinguishing properties (or features or 
attributes, all of which are indicated by the links that tend 
to go horizontally rather than vertically), like we did in 
the dictionary example:

                   vertebrate
                        ^
                        | is-a
                        |
                        |          has-part
                        |        /------------- wings
                        |       /  reproduction
                        |      /--------------- egg-laying
                        |     /    body-temp
                        |    /----------------- warm-blooded
                      bird--<      no. of legs
                      ^ ^ ^  \----------------- 2
                     /  |  \  \    covering
               is-a /   |   \  \--------------- feathers
                   /    |    \  \  movement
       color      /     |     \  \------------- flight
yellow ------canary     |      \
       size  /          | is-a  \ is-a
 small -----/           |        \       movement
                        |        ostrich---------- run
            movement    |              \  size
      swim ----------penguin            \--------- big

If we then allow what is called "inheritance" of these 
features or attributes, we get a big win.  Inheritance means 
that one type inherits or takes on the properties of its 
supertypes, assuming that there's no information to the 
contrary.  So, for example, we know that a canary's primary 
mode of movement is by flight, even though we don't see that 
explicitly represented as a property of canaries, because we 
can see that a bird (the supertype of canary) moves by 
flight.  The canary subtype inherits the property of flight 
from the bird supertype.  If we didn't allow inheritance in 
networks like this, we'd have to attach the property of 
movement by flight to every appropriate node in the network.  
Not only that, but we'd have to repeat every specific 
property everywhere that we wanted it in the network, and 
that would cost us a humongous amount of storage space.  So 
inheritance buys us economy of representation, although any 
program that takes advantage of inheritance is going to have 
to do some extra work to search around the network and find 
out which properties are supposed to be inherited.

We can also make exceptions, and say that a penguin moves 
primarily by swimming, even though it's a bird.  We add that 
property explicitly at the "penguin" node, and it overrides 
the default property of movement by flight at the "bird" 
node.  So, in an "inheritance hierarchy" such as this, 
properties are only passed from supertype to subtype when 
there's no explicit information to the contrary stored with 
the subtype.

Have you seen anything resembling inheritance hierarchies 
before?  Well, if you've taken CS 2390, or you know anything 
about object-oriented programming, it should be obvious to 
you that all we've done here is define a set of data objects.  
Yes, it's true, long before there was Smalltalk, C++, or even 
CLOS (Common LISP Object System), cognitive psychologists of 
the 60s, and their counterparts in the land of artificial 
intelligence, were laying out the foundations of what would 
eventually become object-oriented programming.  


How to represent your networks in LISP

Hey, it's simple.  Use an association list.  You remember 
a-lists, right?  (Hint: we mentioned them at the beginning of
this set of notes.)

(defparameter *database*
  '((canary  (is-a bird)
             (color yellow)
             (size small))
    (penguin (is-a bird)
             (movement swim))
    (bird    (is-a vertebrate)
             (has-part wings)
             (reproduction egg-laying)
                :
                :   )
   )
)

You'd use the "assoc" function with a key of "canary" to 
extract all the information about the "canary" type.  Take 
the "rest" of that result to give you a list of just the 
links paired with what's at the end of those links.  Then use 
the "assoc" function with a key of "is-a" to find out which 
node is the supertype or parent of "canary".  Any pair that 
doesn't start with "is-a" is just a property explicitly 
represented at that node, which could be inherited by any 
subtype below that node.  Nothing to it.  In fact, you might
even be able to adapt some of that code from way back at
the beginning of this set of notes that showed you how
to traverse a binary tree that was implemented as an a-list.
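
The lookup just described, combined with inheritance, might be
sketched like this.  The function name "lookup-property" and the
variable *bird-net* are made up for this illustration, and the
network holds only the nodes and links from the bird diagram
(with "no. of legs" spelled no-of-legs so it's a legal symbol):

```lisp
(defparameter *bird-net*
  '((canary  (is-a bird) (color yellow) (size small))
    (penguin (is-a bird) (movement swim))
    (ostrich (is-a bird) (movement run) (size big))
    (bird    (is-a vertebrate) (has-part wings)
             (reproduction egg-laying) (body-temp warm-blooded)
             (no-of-legs 2) (covering feathers) (movement flight))))

(defun lookup-property (node property net)
  (let* ((links  (rest (assoc node net)))       ; all links leaving NODE
         (local  (assoc property links))        ; stored right here?
         (parent (second (assoc 'is-a links)))) ; supertype, if any
    (cond (local (second local))                ; local value wins
          (parent (lookup-property parent property net)) ; else inherit
          (t nil))))                            ; nothing known
```

So (lookup-property 'canary 'movement *bird-net*) returns FLIGHT,
inherited from bird, while (lookup-property 'penguin 'movement
*bird-net*) returns SWIM, because the value stored at the penguin
node overrides the default.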


Networks and relational databases

Everything we've shown you so far has been purely tree-like 
in form, but as we've said, that's not necessarily going to 
be true.  In fact, it's much more likely that the 
organization in these structures will be more convoluted.  
Consider some of the relationships which may 
exist in a small company that makes cough drops:


          Smith     options
         Brothers ---------- pay plans
            |\                /     \
            | \              /       \
            |  \      salaried     hourly
            |   \      /           /
       dept.|    ---- /           /
            |        \           /
            |  pay  / ----      /pay
            |  plan/      \    /plan
            |     /   dept.\  /
            |    /          \/
       engineering      shipping
          /   \           /   \
         /     \         /     \
        /       \       /       \
     Arnie    Brian   Chuck    David
     Smith    Smith   Smith    Smith

Ugh.  Anyway, welcome to the exciting world of relational 
databases.  Just like object-oriented programming, relational 
database work is something that evolved from artificial 
intelligence ideas about how to organize knowledge (although 
you'll never get a relational database person to admit this), 
which in turn evolved from ideas in cognitive psychology.
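
One wrinkle shows up as soon as a node has more than one link with
the same label, as Smith Brothers does with its two departments:
"assoc" only ever finds the first match.  Here is a hedged sketch of
one way around that.  The name "links-labeled", the variable
*company*, the direction of the links, and the "employs" label on
the unlabeled links in the diagram are all assumptions made for
this illustration:

```lisp
(defparameter *company*
  '((smith-brothers (dept engineering) (dept shipping)
                    (options pay-plans))
    (engineering    (pay-plan salaried)
                    (employs arnie-smith) (employs brian-smith))
    (shipping       (pay-plan hourly)
                    (employs chuck-smith) (employs david-smith))))

;; Return what's at the end of EVERY link with the given label,
;; not just the first one that ASSOC would find.
(defun links-labeled (node label net)
  (mapcar #'second
          (remove-if-not (lambda (link) (eql (first link) label))
                         (rest (assoc node net)))))
```

With this, (links-labeled 'smith-brothers 'dept *company*) returns
(engineering shipping).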


Networks on TV (not ABC, CBS, or NBC...or even Fox, UPN, or the WB)

Network abstractions have even been used by popular 
publications to explain what's going on between the 
characters in television shows.  For example, at the height 
of the popularity of the show "Twin Peaks" several years 
ago, both People and Newsweek published very detailed network 
representations of the relationships between the many 
inhabitants of the town of Twin Peaks.  I showed you all 
reproductions of these diagrams, so I won't bother to repeat
them in ASCII here (whew!), and you could tell just by 
looking at them that these networks are far from tree-like:  
there's no obvious hierarchy, and there are most definitely 
some cycles.  (Oh, by the way, here's some more 
terminology...you'll also find structures like these called 
"semantic networks" instead of "relational networks", depending
on how they're used, but you don't need to worry about that much 
unless you take CS 3361.)  But the fundamental ideas about 
organizing knowledge in terms of things and relationships 
between things are still there, as are the fundamental ideas 
about how to traverse these structures, which we'll be 
discussing real soon now.

But in summary, let's revisit the original question, "Why are 
we getting so excited about these trees and/or networks?"  As 
we've seen, the answer is that we can model so many diverse 
things with them.  In just this brief time, we've seen how we 
can model the organization of dictionaries, human memory 
(maybe), a small company, and a fictional town, all using the 
same basic nodes-and-links representation scheme.  
Furthermore, in so doing, we've shown that this common thread 
runs through cognitive psychology, artificial intelligence, 
object-oriented programming, and relational databases, just 
to name a few areas of academic endeavor.  (Not to mention 
the World Wide Web itself, the infamous Six Degrees of Kevin 
Bacon, and Newsweek's recent wild and wacky world of Kenneth
Starr.)  See, there really is some method to the madness.  
Trust me.


A brief re-introduction to search

Now that we have all this new knowledge about representation 
in trees, hierarchical structures, networks, and the like, we 
need some means for exploring these knowledge structures to 
get at the information we want at the time we want it.  How 
do we do this?  The answer is a bunch of techniques which 
collectively fall under the heading of "search".  Search is a 
concept which permeates computer science.  We'll only touch 
on a couple of kinds of search in this course, but they'll be 
sufficient to demonstrate the basic difference between 
brute-force, exhaustive, or "dumb" search and heuristic or 
"intelligent" search.


Linear search

You probably already know how to do a linear search.  You 
probably did linear searches in previous programming courses.  
For example, starting at the beginning of a file structure 
and looking at record after record for a specific entry is a 
linear search.  (If you've ever seen my office, you know that 
the only way I could find something in there is by linear 
search:  I start at one end of the desk and look at 
everything until I find what I'm looking for.)  Linear 
searches take a long time -- O(n), that kind of time.  
(Actually, assuming an even distribution of stuff in the 
file, you'll look at about n/2 records on average, but that's 
still O(n), since the constants are more or less unimportant.)
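
In LISP terms, a linear search over a list of records might look
something like the sketch below.  The function name
"linear-search", the variable *employees*, and the (name dept)
record layout are invented for this illustration.  It returns the
matching record along with a count of how many records it had to
look at:

```lisp
;; Hypothetical records of the form (name department).
(defparameter *employees*
  '((arnie-smith engineering)
    (brian-smith engineering)
    (chuck-smith shipping)
    (david-smith shipping)))

(defun linear-search (key records &optional (examined 1))
  (cond ((null records) nil)                   ; ran off the end
        ((eql key (first (first records)))     ; found it
         (values (first records) examined))
        (t (linear-search key (rest records) (1+ examined)))))
```

Searching for chuck-smith returns the record (chuck-smith shipping)
as its first value and 3 as its second, since that's how many
records were examined.  A key that isn't there forces a look at
every record before the function gives up and returns nil.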

We can impose a separate indexing scheme on our file 
structure, so that we can cut down on some search time.  For 
example, if the file is kept sorted by employee name, we could 
apply a binary search mechanism to look for an employee record 
in the file.  If the employee's name starts 
with a letter in the range A-M, we could start the search at 
the beginning of the file, but if the name starts with the 
letter N-Z, we would start the search at approximately the 
midway point in the file.  We could continue to divide the 
big groups into smaller groups, until eventually the time to 
find a single record is governed not by the behavior of the 
linear search but by the behavior of the binary search.  
There are other indexing mechanisms that we could use, such 
as hashing functions, that would give us different kinds of 
advantages.
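
Carried all the way through, the halving idea gives the textbook
binary search.  The sketch below is one possible version, not
something from the lecture: the name "binary-search" is made up, and
it assumes the names are strings held in a vector that is already
sorted, so that any position can be reached directly:

```lisp
(defun binary-search (name sorted-names)
  "Return the index of NAME in the sorted vector SORTED-NAMES, or nil."
  (let ((low 0)
        (high (1- (length sorted-names))))
    (loop while (<= low high)
          do (let* ((mid (floor (+ low high) 2))
                    (candidate (aref sorted-names mid)))
               (cond ((string= name candidate) (return mid))
                     ;; NAME sorts before CANDIDATE: keep the lower half
                     ((string< name candidate) (setf high (1- mid)))
                     ;; otherwise keep the upper half
                     (t (setf low (1+ mid))))))))
```

For example, (binary-search "chuck" (vector "arnie" "brian" "chuck"
"david")) returns 2.  Each comparison throws away half of what's
left, so the number of comparisons grows like log n rather than n.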


Searching a hierarchical structure

As we discussed previously, we don't always store our stuff 
in linear formats.  We can also organize knowledge in hierarchies.  
Consider, for example, the Flintstone Family Tree:


                    Chip       Roxy
                      | (twins) |
                      ___________
                           ^
                          / \
                         /   \
                        /     \
               has-mom /       \ has-dad
                      /         \
                    \/_         _\/
                 Pebbles       Bam-Bam
                 / \ has-dad       / \
        has-mom /   \     has-mom /   \ has-dad
               /     \           /     \
             \/_     _\/       \/_     _\/
            Wilma    Fred     Betty   Barney

In structures like this, as before, we may want to search for 
useful information.  But structures like this, unlike linear 
file structures, make it easier to search for the answers to 
questions like "What's the relationship of Barney to Chip?" 
or "Who is Chip's grandfather on his mother's side?"


Depth-first search

The simplest form of search in a hierarchical or network 
structure is called "depth-first search".  Here's an 
algorithm for depth-first search on a binary tree, looking 
for a specific node in the tree:

df-search

1.  look at the root
2.  if it's what you're looking for, then return success
3.  if the root has no descendants, then return failure
4.  call df-search on the subtree whose root is the leftmost
    descendant and return success if that search is 
    successful
5.  call df-search on the subtree whose root is the rightmost
    descendant and return success if that search is 
    successful; otherwise return failure

This algorithm may look somewhat familiar, since it's just 
a variant of the preorder tree traversal algorithm some of
you have seen in previous courses:

preorder

1.  visit the root
2.  call preorder on the left subtree
3.  call preorder on the right subtree

The big differences between the preorder algorithm and the 
depth-first search algorithm are these:

1.  depth-first search stops before searching the whole tree,
    if it finds what it's looking for; preorder traversal
    always examines the entire tree

2.  with depth-first search, searching the right subtree
    occurs only if the search of the left subtree failed to
    find what was being looked for; with preorder traversal,
    the right subtree is always explored (this is sort of a 
    corollary to the first difference listed just above)

How do you implement this in what is quickly becoming your
favorite programming language?  Keep reading.


Implementing depth-first search

Above we talked about the differences between depth-first
search of a binary tree and preorder traversal of that same tree.
These differences make implementation of depth-first search 
more complicated than preorder traversal, but not drastically 
so.  Here's a simple depth-first search implementation for 
the Flintstone Family Tree, using a representation format 
for trees that we've used occasionally before (but this isn't
the only representation that we could have used).  The tree 
looks like this in LISP (and note that just to make things simpler,
we've eliminated one of the twins):

  '(chip (pebbles (wilma nil nil) (fred nil nil))
         (bam-bam (betty nil nil) (barney nil nil)))

And the LISP code itself looks like this:

(defun dfs (item tree)
  (cond ((done? tree) nil)
        ((found-item? item (get-root tree)) item)
        (T (or (dfs item (get-left-subtree tree))
               (dfs item (get-right-subtree tree))))))

(defun done? (tree)
  (null tree))

(defun found-item? (item node)
  (eql item node))

(defun get-root (tree)
  (first tree))

(defun get-left-subtree (tree)
  (second tree))

(defun get-right-subtree (tree)
  (third tree))

I've abstracted away the details of accessing the list data
structure that represents the family tree, leaving only 
a high-level algorithm description in the main function.  In
fact, the only built-in LISP operators used to describe the
high-level algorithm are "defun", "cond", and "or".

The use of "or" in the "dfs" function is an easy way to 
fulfill the requirement that the right subtree isn't searched 
if what we're looking for is found in the left subtree.  It's not 
an especially obvious use of "or", which is typically used as 
a Boolean operator, not as a program control mechanism.  Also, 
this use of "or" depends on exactly how "or" is evaluated 
(i.e., that "or" evaluates its arguments left to right, 
and stops as soon as it finds an argument which evaluates to a 
non-nil value, returning that value), which is not necessarily 
a great thing to lean on.  Since the left-to-right, stop-early 
evaluation of arguments is part of the Common LISP specification 
for "or", we can count on things happening the way we expect them 
to here, but it's not inconceivable that some other LISP dialect 
might implement an "or" that evaluates arguments right-to-left, 
with results that might confuse us a little bit (although in a 
purely functional world, the results wouldn't be catastrophic).  
Furthermore, this assumes 
that any given node has a fixed number of children; if you want to 
cope with a variable number of children at any node, you might want 
to code up a slightly different version of this anyway.  For now, 
we'll leave the "or" there, but feel free to do something better.
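
As one possible "something better", here is a sketch of a version
that copes with any number of children per node.  It assumes a
slightly different tree format from the one above, namely
(node child1 child2 ...) with childless nodes written as one-element
lists; that format, the function name "dfs-nary", and the example
tree are all made up for this illustration.  The built-in "some"
tries the children left to right and stops at the first search that
returns non-nil, which is the same job "or" was doing:

```lisp
;; Hypothetical tree in which nodes have 0, 1, or 3 children.
(defparameter *bushy-tree*
  '(a (b (e) (f) (g))
      (c)
      (d (h))))

(defun dfs-nary (item tree)
  (cond ((null tree) nil)
        ((eql item (first tree)) item)
        ;; SOME returns the first non-nil result, or nil if no
        ;; child's subtree contains ITEM
        (t (some (lambda (child) (dfs-nary item child))
                 (rest tree)))))
```

Here (dfs-nary 'h *bushy-tree*) returns H, and
(dfs-nary 'z *bushy-tree*) returns nil.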


Getting past yes or no

(We ran out of time before we could cover this in class today;
this should help you a bit when you tackle your homework.)
Sadly, the search function described in the previous chunk of
notes above doesn't tell me much---just whether or not an item 
I'm looking for is in the tree.  I'd get more information 
if I could get the search function to tell me how to 
get from the root of the tree to the item I'm looking for, 
assuming the item I'm looking for is in the tree.  
That path from the root to the item would at least be 
an approximation of the relationship between those two nodes 
in the tree; in the case of the Flintstones, for example, the 
path "Chip -has-dad-> Bam-Bam -has-dad-> Barney" tells me 
something about the relationship between Chip and Barney.  
How can I get my depth-first search procedure to 
return this path, instead of just the item itself, when it 
finds the item in the tree?  It's pretty easy.  All you do is 
introduce an additional argument as a sort of variable to 
store the path from the root to wherever the procedure is 
looking in the tree.  You get that additional argument by 
adding a helping function, just like in many of those examples
of tail recursion.  (But note that the addition of the helper
function does not make this depth-first search tail recursive.)
Then it's just a question of building up the result as the 
procedure searches deeper in the tree:

(defun dfs (item tree)
  (dfs-helper item tree nil))

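(defun dfs-helper (item tree result)
  ;; "result" holds the nodes passed on the way down, most recent
  ;; first, so the returned path lists the item first and the root
  ;; last, e.g. (barney bam-bam chip)
  (cond ((done? tree) nil)
        ((found-item? item (get-root tree)) 
         (cons item result))
        (T (or (dfs-helper item 
                           (get-left-subtree tree)
                           (cons (get-root tree) result))
               (dfs-helper item 
                           (get-right-subtree tree)
                           (cons (get-root tree) result))))))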

(defun done? (tree)
  (null tree))

(defun found-item? (item node)
  (eql item node))

(defun get-root (tree)
  (first tree))

(defun get-left-subtree (tree)
  (second tree))

(defun get-right-subtree (tree)
  (third tree))

And note that because I've taken the time to do a great deal 
of data abstraction, separating the functions that access the 
LISP data structure from the higher-level algorithm, all 
I had to do was make a few changes to the top-level 
procedure; the lower-level ones are untouched because we 
didn't make any changes to the LISP data structure.

Again I ask you, isn't abstraction wonderful?
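
If you want to try the path idea out on the Flintstones, here is a
compact, self-contained sketch.  It is a variation on the code
above rather than the code itself: the name "path-to" and the
variable *flintstones* are made up, it uses an optional argument
in place of a separate helper function, and it reverses the
accumulated list so that the path reads from the root down to
the item:

```lisp
(defparameter *flintstones*
  '(chip (pebbles (wilma nil nil) (fred nil nil))
         (bam-bam (betty nil nil) (barney nil nil))))

(defun path-to (item tree &optional path-so-far)
  (cond ((null tree) nil)
        ((eql item (first tree))
         ;; PATH-SO-FAR is most-recent-first, so flip it around
         (reverse (cons item path-so-far)))
        (t (let ((extended (cons (first tree) path-so-far)))
             (or (path-to item (second tree) extended)
                 (path-to item (third tree) extended))))))
```

With this, (path-to 'barney *flintstones*) returns
(chip bam-bam barney), and asking for someone who isn't in the
tree returns nil.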



Copyright 1998 by Kurt Eiselt.  All rights reserved.

Last revised: October 27, 1998