CS 2360 - April 4, 1996

Lecture 4 -- Program Control


The conditional

A programming language really isn't worth much if there's no 
way to change the flow of program control.  Being able to 
take different branches depending on the results of some test 
is what makes computer programs useful.  The basic mechanism 
for doing this is called the "conditional", and in LISP the 
fundamental conditional is called "cond".  Here's the syntax:

(cond (*test1* *action1*)
      (*test2* *action2*)
          :
          :
      (*testN* *actionN*))

If the expression *test1* evaluates to non-nil, then the 
"cond" function returns what the expression *action1* 
evaluates to.  (Since *test1* is an expression, we'd expect 
to see a function call there, or maybe a symbol bound to some 
value...that sort of thing.)  If *test1* evaluates to nil, 
the "cond" skips to *test2*, which is evaluated as above, 
and so on.  Each test-action pair is called a "cond clause".

If all the tests are evaluated in sequence, and all tests 
evaluate to nil, then the "cond" returns nil.  While you can 
count on this to happen, it may not be immediately obvious to 
other folks who read the "cond" expression exactly what the 
original programmer intended to occur in this case.  Good 
programming style in general demands that you make your 
intentions explicit in your code.  Here, that means you 
should always end your "cond" with a cond clause which makes 
it obvious what you expect to happen when all the previous 
tests evaluate to nil.  You do it like this:

(cond (*test1* *action1*)
      (*test2* *action2*)
          :
          :
      (*testN* *actionN*)
      (T *what you want to happen if all else fails*))

Also, you can have more than one action in each cond clause.  
If the test is non-nil, the associated actions will be 
evaluated left-to-right, and the last expression evaluated 
will be the one returned by the "cond" function.  (Note, 
though, that since we're not letting you create any side-
effects by assigning values to variables yet, this feature 
won't be all that useful to you just now.)  

Predicates

Common LISP provides a set of functions which are designed to 
execute useful tests and Boolean or true/false values 
depending on the outcome of the test.  These are called 
predicates, and we use the all the time as the tests in our 
"cond" functions.  Here are some commonly-used predicates:

  (null *expr*)      returns non-nil if *expr* is the empty
                     list, nil if *expr* is not empty

  (atom *expr*)      returns non-nil if *expr* is an atom,
                     nil if *expr* is not an atom

  (numberp *expr*)   returns non-nil if *expr* is a number,
                     nil if *expr* is not a number

  (listp *expr*)     returns non-nil if *expr* is a list,
                     nil if *expr* is not a list

Historically, many functions designed to work as predicates 
(i.e., returning true/false values) have had the letter "p" 
appended to their names, hence "numberp" and "listp".  
Obviously, folks haven't been too consistent in this, since 
"atom" is not "atomp".  It's quaint idiosyncrasies like this 
that give any language some personality, no?  Sometimes, this 
sort of stuff filters into everyday language use.  For 
example, one LISP hacker might ask if another is interested 
in going to lunch by saying simply "lunchp?"....I guess you 
had to be there.

Equality predicates

There are several equality predicates worth knowing about.  
In "A Programmer's Guide to Common LISP", Deborah G. Tatar 
explains it pretty well (pp. 48-50):

"...there are four important general tests for equality.  
These tests take any two LISP objects as arguments, and check 
to see if they are equal.  Naturally, two objects must be of 
the same type to be equal.

You might wonder why four tests are necessary.  Why doesn't 
one test serve the purpose?  The reason is that there are 
degrees of equality.  Most of the time you want to know 
whether two objects look the same, but sometimes you have to 
know whether they are actually the same object in memory.  
That accounts for two of the tests.  Then, as it turns out, 
minor modifications on each of the major tests make two more 
surprisingly useful functions.

EQUALP and EQUAL are the more general equality predicates.  A 
good rule of thumb is that two objects are EQUALP or EQUAL if 
they look the same when they are printed on the screen.

:
:

The difference between EQUAL and EQUALP is that EQUALP is 
less pure in its definition of equality.  Simply because it 
turns out to be useful, EQUALP ignores differences in case in 
characters and type in numbers.  For example,

(equal 3 3.0)
NIL

but

(equalp 3 3.0)
T

Also,

(equal "YES" "yes")
NIL

(equalp "YES" "yes")
T

The last example demonstrates one of the instances in which 
EQUALP is useful; if you had solicited user input, you 
probably wouldn't care whether it was typed in lower-, or 
uppercase letters, or both.

The other two equality predicates, EQ and EQL, tell you 
whether you are looking at two objects in memory or at one.  
Why do we need operators like these?  Consider the following 
calls and returned values:

(equal (cons 'a 'b) (cons 'a 'b))
T

(equalp (cons 'a 'b) (cons 'a 'b))
T

These might look like good answers, and for many purposes 
they are; however, consider that CONS is a function that 
performs an operation.  Each time you call CONS, a new cons 
cell is constructed.  The contents of two cons cells may be 
the same or look the same but they are separate objects, just 
as twins who have DNA with the same sequence of nucleotides 
are still separate persons.  EQ and EQL test whether two 
objects not only look alike, but whether they are the same, 
that is, located in the same place in memory.  In other 
words,

(eq (cons 'a 'b) (cons 'a 'b))
NIL

(eql (cons 'a 'b) (cons 'a 'b))
NIL

This kind of test is important when you have the ability to 
change objects.  Then you often need to know whether both 
items will change, or only one.

:
:

One characteristic difference between EQ and EQL has to do 
with the way LISP handles numbers.  EQ returns true only if 
two numbers are in exactly the same location in memory.  
Small numbers (called FIXNUMS) have a direct representation 
in memory, and are always EQ.  However, LISP must create a 
representation for very large numbers (BIGNUMS) and for 
floating-point numbers each time they are used.  Therefore, 
they may not be EQ.  It turns out that much of the time you 
won't care about exact identity in that case.  Furthermore, 
the number of fixnums is implementation-dependent.  EQL is 
provided as a portable version of EQ.  For example, in a 
given implementation of LISP:

(eq 1234567890 1234567890)

may return T or NIL, but:

(eql 1234567890 1234567890)

always returns T.

The difference between EQ and EQL is rather subtle; in fact, 
the only reason for introducing EQL at this early stage is 
that it is the default test that LISP functions use to test 
for equality."

Using "cond" -- an example

Let's say we want to define a function which tells us if a 
given item is an element of a given list.  This turns out to 
be a very useful function, and it already exists in Common 
LISP.  It's called "member".  But even though it already 
exists, we want the practice, so we're going to construct our 
own version.  And to make sure we don't inadvertently replace 
LISP's version with our own possibly buggy version, we'll 
give ours a distinctive name.  We'll call it "my-member".

What will the design look like?  We can sketch it out with a 
combination of the LISP syntax we already know, and some 
English where we're not sure about the LISP yet.  Here's the 
first cut:

  (defun my-member (input-item input-list)
         if done then return "no"
    else if input-item = first element of input-list
         then return "yes"
    else what?  see if input-item = next thing on input-list?
         how? )

OK, so how are we going to turn all that "if-then-else" stuff into a "cond"?

  (defun my-member (input-item input-list)
    (cond (done then return "no")
          (input-item = first element of input-list
           then return "yes")
          (what?  see if input-item = next thing on 
           input-list? how? ) ) )

Hmmm.  That looks a little more like LISP, but it sure won't 
run on my Macintosh.  What looks like something that's going 
to be real easy to turn into LISP?  How about that test to 
see if input-item is the same as the first element of input-
list?  That should be easy.  Just remember the "cond" syntax:

  (defun my-member (input-item input-list)
    (cond (done then return "no")
          ((eql input-item (first input-list))
           then return "yes")
          (what?  see if input-item = next thing on 
           input-list? how? ) ) )

And how do we return "yes" in that case?

  (defun my-member (input-item input-list)
    (cond (done then return "no")
          ((eql input-item (first input-list)) T)
          (what?  see if input-item = next thing on 
           input-list? how? ) ) )

Nothing to it.  How are we going to test if we're done?  
Well, if we just sort of walk along input-list, testing the 
individual elements to see if they match input-item, what 
would be the termination point?  When we run out of input-
list, or, in other words, when input-list is nil.  So now we 
can translate more English into LISP:

  (defun my-member (input-item input-list)
    (cond ((null input-list) nil)
          ((eql input-item (first input-list)) T)
          (what?  see if input-item = next thing on 
           input-list? how? ) ) )

Wow.  Now I have more LISP than English.  But there's still 
one missing chunk.  How do I get this thing to repeat for 
every element of input-list (or at least until I match input-
item)?  If we were piddling around with Pascal, we'd want to 
create some sort of loop structure, and maybe create a 
variable or two, and throw in an assignment operation here 
and there...make it really complicated, and in the process 
make ourselves feel good about how much mastery we have over 
our computer.  Grrrrr.

Well, that's not gonna happen here.  Not today at least.  
We're going to use a very elegant and computationally pure 
form of iteration which LISP supports very nicely.  It's 
called recursion.

Recursion

"Recursion" essentially means defining something in terms of 
itself.  A function is recursive if it (directly or 
indirectly) calls itself.  A recursive function consists of 
three parts:

1)  the termination condition, or when to stop
2)  the operation or modification, or what to do to the input
    to move closer to a termination condition
3)  the recursive call itself.

Recursion is a program control mechanism that allows 
repetitive operations without traditional iteration, which 
requires the use of side effects and the maintenance of 
variables as counters or temporary storage places...things 
which add unnecessary complexity.  Using recursion 
effectively requires a different style of thinking, but 
you'll get better at it with practice if you find it 
difficult early on.  Recursion also results in nice, clean, 
compact source code which is often easier to read than the 
iterative equivalents.  A recursive function can also eat up 
lots of memory as it is running; we'll see more of this 
later.

Let's go back now and finish "my-member".  What do we want to 
do?  With "my-member", we're trying to build a function which 
does some operation on all the elements of a list, until we 
find a specific element.  If we're thinking recursively, we 
want to break this up into a couple of smaller problems 
(there's that abstraction thing again):

1)  performing that operation of one element of the list,
    combined somehow with...

2)  calling the function just defined on the remainder of the
    list

So let's apply all this thinking about recursion to "my-
member".  So far, we've already coded two different 
termination conditions: stopping when we get to the end of 
input-list without finding a match, and stopping when we find 
a match with input-item.  And the test to see if we find a 
match between input-item and the first element of input-list 
is effectively the "performing that operation of one element 
of the list" that we just mentioned.  But if neither of those 
conditions is true, what do we want to do?  We want to call 
"my-member" on the remainder of input-list, since that will 
get our matching operation performed on the next element of 
the list, while at the same time reducing the size of input-
list and thereby getting us closer to a termination 
condition.  The end result looks like this:

  (defun my-member (input-item input-list)
    (cond ((null input-list) nil)
          ((eql input-item (first input-list)) T)
          (T (my-member input-item (rest input-list)))))

Oh, one other thing.  When "my-member" finds a match, it 
returns T.  But when Common LISP's "member" function returns 
a match, it returns that part of input-list which begins with 
input-item.  That's also a non-nil result, so it has the same 
Boolean value, but it gives us more information than just 
"true" or "false".  You'll find that LISP tries to do that a 
lot, and you should think about doing it too when you can.  
To make "my-member" work that way, it would be changed to 
this:

  (defun my-member (input-item input-list)
    (cond ((null input-list) nil)
          ((eql input-item (first input-list)) input-list)
          (T (my-member input-item (rest input-list)))))

Analyzing the process

Let's look at another example.  In this case, the example is 
a classic when it comes to educating folks about recursion.  
It's computing the factorial of an integer.  The factorial 
function is defined in one of two ways:

n! = n * (n-1) * (n-2) * ... * 2 * 1 if n * 0
     1                               if n = 0

or

n! = n * (n-1)! if n * 0
     1          if n = 0

The first definition suggests a traditional iterative 
approach to computing the factorial, while the second one 
feels much more recursive.  So let's implement the second 
one:

  (defun factorial (n)
    (cond ((eql n 0) 1)
          (T (* n (factorial (- n 1))))))

If we apply our substitution model of evaluation, we can 
analyze what happens when we execute this program.  When we 
first invoke factorial, say on the integer 4, what goes on 
the program stack is the equivalent of this (we'll use "fact" 
instead of "factorial" to save space):

|        |        |        |        |        |        |
|        |        |        |        |        |        |
|        |        |        |        |        |        |
|        |        |        |        |        |        |
|        |        |        |        |        |        |
|        |        |        |        |        |        |
|        |        |        |        |        |        |
|        |        |        |        |        |        |
|        |        |        |        |        |        |
|(fact 4)|        |        |        |        |        |
-------------------------------------------------------

Evaluating (fact 4) results in replacing (fact 4) with the 
multiplication function and two arguments, the integer 4 and 
a function call of (fact 3):

|        |        |        |        |        |        |
|        |        |        |        |        |        |
|        |        |        |        |        |        |
|        |        |        |        |        |        |
|        |        |        |        |        |        |
|        |        |        |        |        |        |
|        |        |        |        |        |        |
|        |(fact 3)|        |        |        |        |
|        |   4    |        |        |        |        |
|(fact 4)|   *    |        |        |        |        |
-------------------------------------------------------

Again, what's on top of the stack gets evaluated and is 
subsequently replaced in our substitution model of 
evaluation:

|        |        |        |        |        |        |
|        |        |        |        |        |        |
|        |        |        |        |        |        |
|        |        |        |        |        |        |
|        |        |        |        |        |        |
|        |        |(fact 2)|        |        |        |
|        |        |   3    |        |        |        |
|        |(fact 3)|   *    |        |        |        |
|        |   4    |   4    |        |        |        |
|(fact 4)|   *    |   *    |        |        |        |
-------------------------------------------------------

This repeated substitution continues until we get to 
(fact 0), which is a termination condition evaluating to 1:

|        |        |        |        |        |        |
|        |        |        |        |(fact 0)|   1    |
|        |        |        |        |   1    |   1    |
|        |        |        |(fact 1)|   *    |   *    |
|        |        |        |   2    |   2    |   2    |
|        |        |(fact 2)|   *    |   *    |   *    |
|        |        |   3    |   3    |   3    |   3    |
|        |(fact 3)|   *    |   *    |   *    |   *    |
|        |   4    |   4    |   4    |   4    |   4    |
|(fact 4)|   *    |   *    |   *    |   *    |   *    |
-------------------------------------------------------

So we hit the termination condition of our recursion, and in 
the process of "unwinding" this recursion we return the 
values of all these stacked or postponed computations:

|        |        |        |        |        |        |
|   1    |        |        |        |        |        |
|   1    |        |        |        |        |        |
|   *    |   1    |        |        |        |        |
|   2    |   2    |        |        |        |        |
|   *    |   *    |   2    |        |        |        |
|   3    |   3    |   3    |        |        |        |
|   *    |   *    |   *    |   6    |        |        |
|   4    |   4    |   4    |   4    |        |        |
|   *    |   *    |   *    |   *    |   24   |        |
-------------------------------------------------------

Here's a slightly different notation for the substitution 
model of evaluation.  Maybe it'll give you a different 
perspective on what's happening.  It looks sort of like the 
results of using the TRACE function (which you'll learn about
in lab, if you haven't already):

(fact 4)
(* 4 (fact 3))
(* 4 (* 3 (fact 2)))
(* 4 (* 3 (* 2 (fact 1))))
(* 4 (* 3 (* 2 (* 1 (fact 0)))))
(* 4 (* 3 (* 2 (* 1 1))))
(* 4 (* 3 (* 2 1)))
(* 4 (* 3 2))
(* 4 6)
24

The interesting thing to note here is "shape" of the growth 
curve of the use of the program stack.  Each time the 
recursive call is made, we use up another chunk of program 
stack.  In fact, the use of memory grows linearly with the 
integer n in the call to factorial (i.e., memory use is O(n) 
for this algorithm).

What we see above is not just a picture of the factorial 
procedure (i.e., the code, the program, the algorithm), it's 
a look at the behavior of the "process"---the procedure in 
execution.  They are different, and it's important to be 
aware of the distinction.  In this case, we see a classic 
pattern of process behavior, and it even has a name.  Our 
recursive procedure gives rise to what's called a "linear 
recursive process".  

O(n) memory use is not exactly something to be proud of.  
We'd really like to do better than that.  "Uh?" you might be 
asking, "Is this the guy who told us not to worry about 
saving a cycle here or a byte there for the sake of 
efficiency?"  Yes, it's still me, but here we're looking at 
saving much more than a few cycles or a few bytes.  Here 
we're questioning whether we'll be able to compute factorials 
for large integers without running into limitations of 
available memory, and that's a whole different problem.  
Let's see if we can fix it.

Tail-recursion

If we go back and look at the shape of the factorial process 
above, we see that the real culprit in our excessive memory 
use is that each call to factorial gets replaced by a 
multiplication and another call to factorial.  We could stop 
this from happening if we could guarantee that each call to 
factorial is replaced only by another call to factorial (or 
something like it).  Here's how it's done:

(defun factorial (n)
  (factorial-iterative 1 1 n))

(defun factorial-iterative (product counter max-count)
  (cond ((* counter max-count) product)
        (T (factorial-iterative (* counter product)
                                (+ counter 1)
                                max-count))))

If we look at the behavior of the process that results from 
this procedure, we get an entirely different shape:

|        |               |               |               |
|(fact 4)|(fact-it 1 1 4)|(fact-it 1 2 4)|(fact-it 2 3 4)| (continued...)
----------------------------------------------------------

|               |                |    |
|(fact-it 6 4 4)|(fact-it 24 5 4)| 24 |
---------------------------------------

Or, using the other notation:

(fact 4)
(fact-it 1 1 4)
(fact-it 1 2 4)
(fact-it 2 3 4)
(fact-it 6 4 4)
(fact-it 24 5 4) 
24

Now, isn't that nice?  No growth in program stack space at 
all, and we didn't trade away an increase in execution time 
to get this either, as there are the pretty much the same 
number of function calls, multiplications, and so on.  So in 
this case, our memory usage is O(1) instead of O(n), and that 
is something to be proud of.  The shape is just a horizontal 
line, which is characteristic of what's called a "linear 
iterative process" (note that the process can be iterative, 
referring to how the shape is produced, even though the 
procedure itself is defined recursively).

To accomplish this, we used a type of recursion called "tail 
recursion".  In LISP, we obtain tail recursion by making sure 
that the step in our procedure which is the recursive step 
does not have any "work" being done "outside" the recursive 
function call.  Compare the corresponding steps from the two 
different function definitions.  Here's the first version:

          (T (* n (factorial (- n 1))))))

That "(* n " part is what was killing us in terms of memory 
usage.  This kind of recursion is called "augmenting 
recursion", meaning that there's some additional computation 
added on to the recursive call, and that makes the program 
stack grow.  In the tail recursive version, however, there's 
no augmenting recursion:  

          (T (factorial-iterative (* counter product)
                                  (+ counter 1)
                                  max-count))))

All the computation is done within the arguments to the 
function call, and so the substitutions are one-for-one and 
don't cause any growth of the program stack.

Here's a slightly different perspective which may help you
understand the difference between these two types of 
recursion.  The first version of the factorial function
postponed arithmetic operations as it progressed, and to
postpone those operations, LISP had to remember what those
operations were.  To remember all that stuff, LISP had to
use a chunk of memory on the program stack for each postponed
operation.

But when the tail-recursive version of factorial is running,
no arithmetic operations are being postponed.  All 
computations are being done as they're needed, so no 
operations must be remembered, and consequently there's
no corresponding growth in usage of the program stack.

In order to make this work, though, we had to introduce the 
equivalent of variables as place holders for these embedded 
computations, and we did this by introducing some additional 
arguments.  Those arguments in this case were established 
through the creation of a helping function, which initializes 
the arguments being used as variables:

(defun factorial (n)
  (factorial-iterative 1 1 n))

The arguments themselves take on the roles of the sorts of 
things you'd expect to see in a traditional iterative 
solution:

(defun factorial-iterative (product counter max-count)
  ...

Here, "product" is used as the place where you store your 
cumulative product, and it is initialized to the 
multiplicative identity (i.e., 1).  "counter" is just your 
index variable, and "max-count" is the limit on the number of 
"iterations" you want this thing to perform.

In class, one of you asked a really good question about this.
It went something like this:  "Isn't this cheating?  Isn't this
really iteration and not recursion?"  As we pointed out, by 
definition, this is still recursion, because the function
is defined in terms of itself.  But tail recursion is most
definitely a clever and successful attempt to reap some
of the resource savings of iterative solutions while maintaining 
design qualities of the functional programming paradigm.
We gain the advantages of iterative solutions without the
corresponding increase in complexity.  What kind of complexity?
Well, just think about this for a minute.  If I'm debugging
a recursive function that's constructed within the constraints
of the functional programming paradigm (like "factorial-iterative"), 
I know that those "variables" won't be changing value while I 
examine the execution of that function.  That's gonna make my life 
easier, in the short run as well as the long run.  On the other 
hand, if I'm debugging the traditional, loop-based, side-effects-laden
equivalent of "factorial-iterative", I've got three variables
which *are* changing values as that procedure is running, and
that makes it harder for me to follow what's going on, and therefore
harder for me to make sure that the procedure actually does what I
want it to do.  And as we now know, that's the kind of problem that
can turn a relatively benign medical linear accelerator into a 
25MeV death ray in just a few keystrokes.  

Sometimes the tail recursive solutions are not immediately 
obvious, and sometimes they just seem natural. (As Lyman pointed
out, just take a look at "my-member" to see just how obvious
the tail recursive solution can be.)  Like many things in this 
course, it's not especially hard, but it does require a different 
slant on thinking about the problem, and again like many things 
in this course, it gets better with practice.  In any case, going 
after the tail recursive solution is worthwhile, for reasons 
we've shown above.  Also, some LISP systems have the ability to 
recognize and further optimize tail recursive functions.  The 
object code that is produced is actually a simple loop (which 
you never see) with variables corresponding to the arguments being 
passed in the recursive function call.  This gives you the elegant 
expressiveness of recursion with the speed of loops or 
whatever your flavor of iterative control structure might be.  
You get all the good stuff, and none of the bad stuff.

You might argue that tail recursive solutions are harder to 
read, and I believe that's true for new LISP programmers.  
But as you gain more experience, the tail recursive solution 
becomes just another programming cliche and no longer causes 
any difficulty in understanding what's going on.  Remember that
lots of tasks look difficult the first time you encounter them,
but over time they become second nature.  When you were real 
little, your primary mode of locomotion was crawling.  At some
point you tried standing up and walking, and this was undoubtedly
much more difficult than crawling at first.  But you kept at it,
and now you get around by walking instead of crawling (except perhaps
the morning after one of those particularly intense frat parties).
The moral is simply that if you practice, you get better.

The myth of efficiency

One of the big myths that is constantly perpetuated in the 
religious wars about programming paradigms and programming 
languages is that functional programming in general (and LISP 
in particular) is to be avoided because of all this recursion 
stuff.  It's hard to learn, they say, and it's obviously 
inefficient.

It may be true that the concept of recursion itself is 
difficult to grasp, but for most folks it seems that the real 
problem is that they've been trained to think about 
repetitive computations in a certain way, using index 
variables and loops and so on.  That training interferes with 
thinking about the same problems in different ways; you'll 
feel that urge to fall back on old familiar habits.  That 
urge should diminish as you become more comfortable with the 
paradigm.  

But the knock on efficiency is in fact nothing more than 
mythology spread by folks who never learned the whole story.  
As we've just seen:

1)  If I'm thinking about it instead of acting like I'm brain
    dead, I can use tail recursion and get very efficient 
    results.  Remember that brain-dead programmers make 
    inefficient programs, regardless of their choice of 
    language or paradigm.

2)  Again, efficiency of the programmer, especially on the 
    back end of the software life cycle, is greatly improved
    by the use of recursion (and other functional programming
    ideas), assuming of course that the other folks who read
    your code also aren't brain dead.

Oh, by the way...

...there's another version of COND that's worth knowing 
about.  It's simpler than COND, but it's only useful in 
limited circumstances.  So long as there are no more than two 
alternatives, and so long as there's only one action 
associated with each alternative, IF works fine.  If your 
needs go outside these limitations, you'll want to start 
thinking about COND.

Here's the syntax for IF:

  (if *test*
      *then-action*
      *else-action*)

Here's the IF-based version of factorial:

  (defun factorial (n)
    (if (eql n 0)
        1
        (* n (factorial (- n 1)))))



Copyright 1996 by Kurt Eiselt.  All rights reserved.

Last revised: April 5, 1996