Lecture 4 -- Program control The conditional A programming language really isn't worth much if there's no way to change the flow of program control. Being able to take different branches depending on the results of some test is what makes computer programs useful. The basic mechanism for doing this is called the "conditional", and in LISP the fundamental conditional is called "cond". Here's the syntax: (cond ( ) ( ) : : ( )) If the expression evaluates to non-nil, then the "cond" function returns what the expression evaluates to. (Since is an expression, we'd expect to see a function call there, or maybe a symbol bound to some value...that sort of thing.) If evaluates to nil, the "cond" skips to , which is evaluated as above, and so on. Each test-action pair is called a "cond clause". If all the tests are evaluated in sequence, and all tests evaluate to nil, then the "cond" returns nil. While you can count on this to happen, it may not be immediately obvious to other folks who read the "cond" expression exactly what the original programmer intended to occur in this case. Good programming style in general demands that you make your intentions explicit in your code. Here, that means you should always end your "cond" with a cond clause which makes it obvious what you expect to happen when all the previous tests evaluate to nil. You do it like this: (cond ( ) ( ) : : ( ) (T )) Also, you can have more than one action in each cond clause. If the test is non-nil, the associated actions will be evaluated left-to-right, and the last expression evaluated will be the one returned by the "cond" function. (Note, though, that since we're not letting you create any side- effects by assigning values to variables yet, this feature won't be all that useful to you just now.) Predicates Common LISP provides a set of functions which are designed to execute useful tests and Boolean or true/false values depending on the outcome of the test. These are called predicates, and we use the all the time as the tests in our "cond" functions. Here are some commonly-used predicates: (null ) returns non-nil if is the empty list, nil if is not empty (atom ) returns non-nil if is an atom, nil if is not an atom (numberp ) returns non-nil if is a number, nil if is not a number (listp ) returns non-nil if is a list, nil if is not a list Historically, many functions designed to work as predicates (i.e., returning true/false values) have had the letter "p" appended to their names, hence "numberp" and "listp". Obviously, folks haven't been too consistent in this, since "atom" is not "atomp". It's quaint idiosyncrasies like this that give any language some personality, no? Sometimes, this sort of stuff filters into everyday language use. For example, one LISP hacker might ask if another is interested in going to lunch by saying simply "lunchp?"....I guess you had to be there. Equality predicates There are several equality predicates worth knowing about. In "A Programmer's Guide to Common LISP", Deborah G. Tatar explains it pretty well (pp. 48-50): "...there are four important general tests for equality. These tests take any two LISP objects as arguments, and check to see if they are equal. Naturally, two objects must be of the same type to be equal. You might wonder why four tests are necessary. Why doesn't one test serve the purpose? The reason is that there are degrees of equality. Most of the time you want to know whether two objects look the same, but sometimes you have to know whether they are actually the same object in memory. That accounts for two of the tests. Then, as it turns out, minor modifications on each of the major tests make two more surprisingly useful functions. EQUALP and EQUAL are the more general equality predicates. A good rule of thumb is that two objects are EQUALP or EQUAL if they look the same when they are printed on the screen. : : The difference between EQUAL and EQUALP is that EQUALP is less pure in its definition of equality. Simply because it turns out to be useful, EQUALP ignores differences in case in characters and type in numbers. For example, (equal 3 3.0) NIL but (equalp 3 3.0) T Also, (equal "YES" "yes") NIL (equalp "YES" "yes") T The last example demonstrates one of the instances in which EQUALP is useful; if you had solicited user input, you probably wouldn't care whether it was typed in lower-, or uppercase letters, or both. The other two equality predicates, EQ and EQL, tell you whether you are looking at two objects in memory or at one. Why do we need operators like these? Consider the following calls and returned values: (equal (cons 'a 'b) (cons 'a 'b)) T (equalp (cons 'a 'b) (cons 'a 'b)) T These might look like good answers, and for many purposes they are; however, consider that CONS is a function that performs an operation. Each time you call CONS, a new cons cell is constructed. The contents of two cons cells may be the same or look the same but they are separate objects, just as twins who have DNA with the same sequence of nucleotides are still separate persons. EQ and EQL test whether two objects not only look alike, but whether they are the same, that is, located in the same place in memory. In other words, (eq (cons 'a 'b) (cons 'a 'b)) NIL (eql (cons 'a 'b) (cons 'a 'b)) NIL This kind of test is important when you have the ability to change objects. Then you often need to know whether both items will change, or only one. : : One characteristic difference between EQ and EQL has to do with the way LISP handles numbers. EQ returns true only if two numbers are in exactly the same location in memory. Small numbers (called FIXNUMS) have a direct representation in memory, and are always EQ. However, LISP must create a representation for very large numbers (BIGNUMS) and for floating-point numbers each time they are used. Therefore, they may not be EQ. It turns out that much of the time you won't care about exact identity in that case. Furthermore, the number of fixnums is implementation-dependent. EQL is provided as a portable version of EQ. For example, in a given implementation of LISP: (eq 1234567890 1234567890) may return T or NIL, but: (eql 1234567890 1234567890) always returns T. The difference between EQ and EQL is rather subtle; in fact, the only reason for introducing EQL at this early stage is that it is the default test that LISP functions use to test for equality." Using "cond" -- an example Let's say we want to define a function which tells us if a given item is an element of a given list. This turns out to be a very useful function, and it already exists in Common LISP. It's called "member". But even though it already exists, we want the practice, so we're going to construct our own version. And to make sure we don't inadvertently replace LISP's version with our own possibly buggy version, we'll give ours a distinctive name. We'll call it "my-member". What will the design look like? We can sketch it out with a combination of the LISP syntax we already know, and some English where we're not sure about the LISP yet. Here's the first cut: (defun my-member (input-item input-list) if done then return "no" else if input-item = first element of input-list then return "yes" else what? see if input-item = next thing on input-list? how? ) OK, so how are we going to turn all that "if-then-else" stuff into a "cond"? (defun my-member (input-item input-list) (cond (done then return "no") (input-item = first element of input-list then return "yes") (what? see if input-item = next thing on input-list? how? ) ) ) Hmmm. That looks a little more like LISP, but it sure won't run on my Macintosh. What looks like something that's going to be real easy to turn into LISP? How about that test to see if input-item is the same as the first element of input- list? That should be easy. Just remember the "cond" syntax: (defun my-member (input-item input-list) (cond (done then return "no") ((eql input-item (first input-list)) then return "yes") (what? see if input-item = next thing on input-list? how? ) ) ) And how do we return "yes" in that case? (defun my-member (input-item input-list) (cond (done then return "no") ((eql input-item (first input-list)) T) (what? see if input-item = next thing on input-list? how? ) ) ) Nothing to it. How are we going to test if we're done? Well, if we just sort of walk along input-list, testing the individual elements to see if they match input-item, what would be the termination point? When we run out of input- list, or, in other words, when input-list is nil. So now we can translate more English into LISP: (defun my-member (input-item input-list) (cond ((null input-list) nil) ((eql input-item (first input-list)) T) (what? see if input-item = next thing on input-list? how? ) ) ) Wow. Now I have more LISP than English. But there's still one missing chunk. How do I get this thing to repeat for every element of input-list (or at least until I match input- item)? If we were piddling around with Pascal, we'd want to create some sort of loop structure, and maybe create a variable or two, and throw in an assignment operation here and there...make it really complicated, and in the process make ourselves feel good about how much mastery we have over our computer. Grrrrr. Well, that's not gonna happen here. Not today at least. We're going to use a very elegant and computationally pure form of iteration which LISP supports very nicely. It's called recursion. Recursion "Recursion" essentially means defining something in terms of itself. A function is recursive if it (directly or indirectly) calls itself. A recursive function consists of three parts: 1) the termination condition, or when to stop 2) the operation or modification, or what to do to the input to move closer to a termination condition 3) the recursive call itself. Recursion is a program control mechanism that allows repetitive operations without traditional iteration, which requires the use of side effects and the maintenance of variables as counters or temporary storage places...things which add unnecessary complexity. Using recursion effectively requires a different style of thinking, but you'll get better at it with practice if you find it difficult early on. Recursion also results in nice, clean, compact source code which is often easier to read than the iterative equivalents. A recursive function can also eat up lots of memory as it is running; we'll see more of this later. Let's go back now and finish "my-member". What do we want to do? With "my-member", we're trying to build a function which does some operation on all the elements of a list, until we find a specific element. If we're thinking recursively, we want to break this up into a couple of smaller problems (there's that abstraction thing again): 1) performing that operation of one element of the list, combined somehow with... 2) calling the function just defined on the remainder of the list So let's apply all this thinking about recursion to "my- member". So far, we've already coded two different termination conditions: stopping when we get to the end of input-list without finding a match, and stopping when we find a match with input-item. And the test to see if we find a match between input-item and the first element of input-list is effectively the "performing that operation of one element of the list" that we just mentioned. But if neither of those conditions is true, what do we want to do? We want to call "my-member" on the remainder of input-list, since that will get our matching operation performed on the next element of the list, while at the same time reducing the size of input- list and thereby getting us closer to a termination condition. The end result looks like this: (defun my-member (input-item input-list) (cond ((null input-list) nil) ((eql input-item (first input-list)) T) (T (my-member input-item (rest input-list))))) Oh, one other thing. When "my-member" finds a match, it returns T. But when Common LISP's "member" function returns a match, it returns that part of input-list which begins with input-item. That's also a non-nil result, so it has the same Boolean value, but it gives us more information than just "true" or "false". You'll find that LISP tries to do that a lot, and you should think about doing it too when you can. To make "my-member" work that way, it would be changed to this: (defun my-member (input-item input-list) (cond ((null input-list) nil) ((eql input-item (first input-list)) input-list) (T (my-member input-item (rest input-list))))) Analyzing the process Let's look at another example. In this case, the example is a classic when it comes to educating folks about recursion. It's computing the factorial of an integer. The factorial function is defined in one of two ways: n! = n * (n-1) * (n-2) * ... * 2 * 1 if n > 0 1 if n = 0 or n! = n * (n-1)! if n > 0 1 if n = 0 The first definition suggests a traditional iterative approach to computing the factorial, while the second one feels much more recursive. So let's implement the second one: (defun factorial (n) (cond ((eql n 0) 1) (T (* n (factorial (- n 1)))))) If we apply our substitution model of evaluation, we can analyze what happens when we execute this program. When we first invoke factorial, say on the integer 4, what goes on the program stack is the equivalent of this (we'll use "fact" instead of "factorial" to save space): | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |(fact 4)| | | | | | ------------------------------------------------------- Evaluating (fact 4) results in replacing (fact 4) with the multiplication function and two arguments, the integer 4 and a function call of (fact 3): | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |(fact 3)| | | | | | | 4 | | | | | |(fact 4)| * | | | | | ------------------------------------------------------- Again, what's on top of the stack gets evaluated and is subsequently replaced in our substitution model of evaluation: | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |(fact 2)| | | | | | | 3 | | | | | |(fact 3)| * | | | | | | 4 | 4 | | | | |(fact 4)| * | * | | | | ------------------------------------------------------- This repeated substitution continues until we get to (fact 0), which is a termination condition evaluating to 1: | | | | | | | | | | | |(fact 0)| 1 | | | | | | 1 | 1 | | | | |(fact 1)| * | * | | | | | 2 | 2 | 2 | | | |(fact 2)| * | * | * | | | | 3 | 3 | 3 | 3 | | |(fact 3)| * | * | * | * | | | 4 | 4 | 4 | 4 | 4 | |(fact 4)| * | * | * | * | * | ------------------------------------------------------- So we hit the termination condition of our recursion, and in the process of "unwinding" this recursion we return the values of all these stacked or postponed computations: | | | | | | | | 1 | | | | | | | 1 | | | | | | | * | 1 | | | | | | 2 | 2 | | | | | | * | * | 2 | | | | | 3 | 3 | 3 | | | | | * | * | * | 6 | | | | 4 | 4 | 4 | 4 | | | | * | * | * | * | 24 | | ------------------------------------------------------- Here's a slightly different notation for the substitution model of evaluation. Maybe it'll give you a different perspective on what's happening. It looks sort of like the results of using the TRACE function: (fact 4) (* 4 (fact 3)) (* 4 (* 3 (fact 2))) (* 4 (* 3 (* 2 (fact 1)))) (* 4 (* 3 (* 2 (* 1 (fact 0))))) (* 4 (* 3 (* 2 (* 1 1)))) (* 4 (* 3 (* 2 1))) (* 4 (* 3 2)) (* 4 6) 24 The interesting thing to note here is "shape" of the growth curve of the use of the program stack. Each time the recursive call is made, we use up another chunk of program stack. In fact, the use of memory grows linearly with the integer n in the call to factorial (i.e., memory use is O(n) for this algorithm). What we see above is not just a picture of the factorial procedure (i.e., the code, the program, the algorithm), it's a look at the behavior of the "process"---the procedure in execution. They are different, and it's important to be aware of the distinction. In this case, we see a classic pattern of process behavior, and it even has a name. Our recursive procedure gives rise to what's called a "linear recursive process". O(n) memory use is not exactly something to be proud of. We'd really like to do better than that. "Uh?" you might be asking, "Is this the guy who told us not to worry about saving a cycle here or a byte there for the sake of efficiency?" Yes, it's still me, but here we're looking at saving much more than a few cycles or a few bytes. Here we're questioning whether we'll be able to compute factorials for large integers without running into limitations of available memory, and that's a whole different problem. Let's see if we can fix it. Tail-recursion If we go back and look at the shape of the factorial process above, we see that the real culprit in our excessive memory use is that each call to factorial gets replaced by a multiplication and another call to factorial. We could stop this from happening if we could guarantee that each call to factorial is replaced only by another call to factorial (or something like it). Here's how it's done: (defun factorial (n) (factorial-iterative 1 1 n)) (defun factorial-iterative (product counter max-count) (cond ((> counter max-count) product) (T (factorial-iterative (* counter product) (+ counter 1) max-count)))) If we look at the behavior of the process that results from this procedure, we get an entirely different shape: | | | | | |(fact 4)|(fact-it 1 1 4)|(fact-it 1 2 4)|(fact-it 2 3 4)| => ---------------------------------------------------------- | | | | |(fact-it 6 4 4)|(fact-it 24 5 4)| 24 | --------------------------------------- Or, using the other notation: (fact 4) (fact-it 1 1 4) (fact-it 1 2 4) (fact-it 2 3 4) (fact-it 6 4 4) (fact-it 24 5 4) 24 Now, isn't that nice? No growth in program stack space at all, and we didn't trade away an increase in execution time to get this either, as there are the pretty much the same number of function calls, multiplications, and so on. So in this case, our memory usage is O(1) instead of O(n), and that is something to be proud of. The shape is just a horizontal line, which is characteristic of what's called a "linear iterative process" (note that the process can be iterative, referring to how the shape is produced, even though the procedure itself is defined recursively). To accomplish this, we used a type of recursion called "tail recursion". In LISP, we obtain tail recursion by making sure that the step in our procedure which is the recursive step does not have any "work" being done "outside" the recursive function call. Compare the corresponding steps from the two different function definitions. Here's the first version: (T (* n (factorial (- n 1)))))) That "(* n " part is what was killing us in terms of memory usage. This kind of recursion is called "augmenting recursion", meaning that there's some additional computation added on to the recursive call, and that makes the program stack grow. In the tail recursive version, however, there's no augmenting recursion: (T (factorial-iterative (* counter product) (+ counter 1) max-count)))) All the computation is done within the arguments to the function call, and so the substitutions are one-for-one and don't cause any growth of the program stack. Here's a slightly different perspective which may help you understand the difference between these two types of recursion. The first version of the factorial function postponed arithmetic operations as it progressed, and to postpone those operations, LISP had to remember what those operations were. To remember all that stuff, LISP had to use a chunk of memory on the program stack for each postponed operation. But when the tail-recursive version of factorial is running, no arithmetic operations are being postponed. All computations are being done as they're needed, so no operations must be remembered, and consequently there's no corresponding growth in usage of the program stack. In order to make this work, though, we had to introduce the equivalent of variables as place holders for these embedded computations, and we did this by introducing some additional arguments. Those arguments in this case were established through the creation of a helping function, which initializes the arguments being used as variables: (defun factorial (n) (factorial-iterative 1 1 n)) The arguments themselves take on the roles of the sorts of things you'd expect to see in a traditional iterative solution: (defun factorial-iterative (product counter max-count) ... Here, "product" is used as the place where you store your cumulative product, and it is initialized to the multiplicative identity (i.e., 1). "counter" is just your index variable, and "max-count" is the limit on the number of "iterations" you want this thing to perform. Sometimes the tail recursive solutions are not immediately obvious, and sometimes they just seem natural. Like many things in this course, it's not especially hard, but it does require a different slant on thinking about the problem, and again like many things in this course, it gets better with practice. In any case, going after the tail recursive solution is worthwhile, for reasons we've shown above. Also, some LISP systems have the ability to recognize and further optimize tail recursive functions. The object code that is produced is actually a simple loop (which you never see) with variables corresponding to the arguments being passed in the recursive function call. This gives you the elegant expressiveness of recursion with the speed of loops or whatever your flavor of iterative control structure might be. You get all the good stuff, and none of the bad stuff. You might argue that tail recursive solutions are harder to read, and I believe that's true for new LISP programmers. But as you gain more experience, the tail recursive solution becomes just another programming cliche and no longer causes any difficulty in understanding what's going on. Remember that lots of tasks look difficult the first time you encounter them, but over time they become second nature. When you were real little, your primary mode of locomotion was crawling. At some point you tried standing up and walking, and this was undoubtedly much more difficult than crawling at first. But you kept at it, and now you get around by walking instead of crawling (except perhaps the morning after one of those particularly intense frat parties). The moral is simply that if you practice, you get better. The myth of efficiency One of the big myths that is constantly perpetuated in the religious wars about programming paradigms and programming languages is that functional programming in general (and LISP in particular) is to be avoided because of all this recursion stuff. It's hard to learn, they say, and it's obviously inefficient. It may be true that the concept of recursion itself is difficult to grasp, but for most folks it seems that the real problem is that they've been trained to think about repetitive computations in a certain way, using index variables and loops and so on. That training interferes with thinking about the same problems in different ways; you'll feel that urge to fall back on old familiar habits. That urge should diminish as you become more comfortable with the paradigm. But the knock on efficiency is in fact nothing more than mythology spread by folks who never learned the whole story. As we've just seen: 1) If I'm thinking about it instead of acting like I'm brain dead, I can use tail recursion and get very efficient results. Remember that brain-dead programmers make inefficient programs, regardless of their choice of language or paradigm. 2) Again, efficiency of the programmer, especially on the back end of the software life cycle, is greatly improved by the use of recursion (and other functional programming ideas), assuming of course that the other folks who read your code also aren't brain dead. Oh, by the way... ...there's another version of COND that's worth knowing about. It's simpler than COND, but it's only useful in limited circumstances. So long as there are no more than two alternatives, and so long as there's only one action associated with each alternative, IF works fine. If your needs go outside these limitations, you'll want to start thinking about COND. Here's the syntax for IF: (if ) Here's the IF-based version of factorial: (defun factorial (n) (if (eql n 0) 1 (* n (factorial (- n 1))))) Copyright 1995 by Kurt Eiselt. All rights reserved. -- Kurt Eiselt Assistant Dean, Student Services College of Computing Georgia Institute of Technology Atlanta, GA 30332-0280 http://www.cc.gatech.edu/aimosaic/faculty/eiselt.html