October 10, 1995 -- Applicative programming But first, just one more recursion template The next function we'll construct isn't already defined in Common LISP, but most good LISP hackers have one of these sitting around in their personal library of LISP functions, or they can at least reconstruct it on demand. I call the function "flatten", and its purpose is to take any list, nested arbitrarily deep, and return a simple list of all the atoms in the original list. In other words, it takes a hierarchical list structure and returns a linear list structure, with the order of the atoms intact. Or, in still other words, it removes all the parentheses except the outermost ones: ? (flatten '(a (b (c) d (e) f))) (A B C D E F) ? How the heck are you supposed to make that happen? Well, the trick is to use abstraction to break the problem into three smaller problems: flattening the first element of the list (which could be an atom or a list) into a linear list, flattening the rest of the list into a linear list, and then combining the results into a single linear list. We already know that the way we join two lists into a single list is by using "append", so that problem is solved, and the other two problems are handled with recursive calls to "flatten", so the whole thing turns out to be pretty easy. Maybe the hardest part is figuring out what the termination conditions are: (defun flatten (my-list) (cond ((null my-list) nil) ((atom my-list) (list my-list)) (T (append (flatten (first my-list)) (flatten (rest my-list)))))) The first termination condition was pretty straightforward. If "flatten" is passed an empty list, it just returns the empty list. There's really not anything else to do, and that empty list will just be "append"ed onto other lists, resulting in no change whatsoever. The second termination condition is a little bit tricky, but since I already know that I'm using "append" to join my lists together, I also know that if "flatten" is passed an atom, it needs to turn that into a list so that it can be "appended" onto the other lists that are being put together by other calls to "flatten". The type of recursion that's going on here has several names. It can be called "multiple recursion" or "tree recursion", or it can be called "car/cdr recursion", which is a special kind of multiple or tree recursion. "Multiple recursion" is an appropriate name because, from the point of view of our substitution model of evaluation, each call to "flatten" is substituted on the program stack by multiple calls to "flatten". "Tree recursion" is an appropriate name if we think about how the original problem is broken down: we start by trying to flatten the original list; that problem becomes two problems of trying to flatten two parts of that list; each of those problems spins off two more problems, and so on. If you were to sit down and draw a little graph of how the original problem was divided into two subproblems, and each of those was further divided, and on and on, you'd get yourself a nice little tree-like structure. (Also, note that a hierarchical list structure can be interpreted as a tree structure, and that "flatten" can be viewed as a preorder tree traversal procedure. But we'll see more on that later.) Finally, "car/cdr recursion" is appropriate in this particular use of tree recursion because the initial problem is broken down into recursive calls of "flatten" on the "first" (or "car") of the list and the "rest" or ("cdr") of the list. Today, we might be better off calling it "first/rest" recursion, but we'd lose that sense of history and tradition. The important thing to remember here is that multiple or tree recursion is a very standard way to travel around really complex data structures, or to solve some problems with really complex decompositions. We'll be seeing this recursion template pop up in lots of different places. Another way to get repetitive operations Recursion is just a nice way to get iterative ability out of our computer programs without adding the complexity that comes with the introduction of counter variables and accumulator variables and assignment operations. How do we know that things get more complex when we add that stuff? Well, for starters, think about that nice simple substitution model of evaluation---it's pretty understandable, no? Imagine what that looks like if we add variables and assignment. If you could keep track of it all, every substitution involves keeping track of all the variables and what they're bound to at every given step of the way...ugh. It's no longer simple. In fact, the substitution model becomes so cumbersome under these conditions that we have to come up with other models of evaluation just to accommodate the complexity. But I digress. There's yet another way to get repetition out of our procedures, and that falls under the header of "applicative programming" as opposed to "recursive programming". To demonstrate how applicative programming works, though, we'll start with a recursive explanation of applicative programming. When using a programming language whose basic data structure is the list, it's not surprising that one of the things we want to do most often is perform the same procedure on all the elements of a list. We already know how to do that recursively. Say for example, that we want to take a list of numbers as input and return a list of corresponding square roots. That procedure might look like this (do you recognize the recursion template?): (defun lotsa-sqrts (input-list) (cond ((null input-list) nil) (T (cons (sqrt (first input-list)) (lotsa-sqrts (rest input-list)))))) and it would work like this: ? (lotsa-sqrts '(1 4 9 16 25)) (1 2 3 4 5) ? Now, since we often want to do repetitive operations like this, and most of the time that operation will probably be something other than the square root procedure, wouldn't it be nice if we could construct a generic function to which we could pass another function and a list of things to do be worked on in succession by that function? Well of course it would! And folks, note here that when I say we want to pass it a function, I don't mean passing it a function call to be evaluated, as in (sqrt (* 2 2))...that's different. I mean we want to pass the whole function definition as a parameter. Eek! How the %$#@! are we gonna do that?! The case of the headless function In order to make this happen, there are some things you need to know first. Remember way back when we were touting all the things that made LISP so nice? One of those features was the lack of any procedure/data distinction. Here's one place where that feature comes in handy. Since a procedure can be treated as a piece of data, we can decompose procedures into component parts. That means we can separate a function body from its name. (How did that body get that name? The "defun" function did it, remember?) In other words, its possible (and in many cases desirable, as we're about to see) to use a nameless, anonymous function definition in our computations. LISP of course has a predefined function for separating function bodies from their names, and it is named, ironically, "function". Here's how it works: Let's say you've defined a new function of your own, which in this case takes a number as an argument and returns its square. When you use "defun" to create that function: (defun sqr (number) (* number number)) you've told LISP to bind the name "sqr" to everything that follows the name "sqr" in your definition. That association is then stored in the symbol table. Later, when you call the "sqr" function, LISP looks up "sqr" in the symbol table and retrieves the function body associated with it, then uses that function body to figure out what to do. If we want to tell LISP at some point to explicitly go to the symbol table, look up the "sqr" function, retrieve the function body and return it without doing anything else with it, then we can just say: ? (function sqr) Now, if we're doing this in some form of Common LISP that used to be available here, we'd see this: (function sqr) (LAMBDA (NUMBER) (BLOCK SQR (* NUMBER NUMBER))) If, on the other hand, we do this in Mac CL, we might get something that's somewhat less interesting from an instructional viewpoint, but has the same effect: ? (function sqr) # ? And if you turn off the incremental compiler, you'd get something like: ? (function sqr) # ? This is what's called an implementation-dependent detail. Nevertheless, let's go back to the other Common LISP version: (function sqr) (LAMBDA (NUMBER) (BLOCK SQR (* NUMBER NUMBER))) What you're looking at is LISP's basic anonymous function form, called the "lambda function". The "(number)" part is the argument list, and the "(block " part says that the previous arguments are "lexically scoped" over the rest of the function body, ending with the ")" that balances out the "(block " part. Those are all details that will make more sense later when we cover scoping issues in LISP. That "function" function gets used a lot, it turns out, so there's an abbreviation. Instead of typing ? (function sqr) we could type ? #'sqr and get the same result. That abbreviation is pronounced "hash-quote". What can you do with a lambda function? Back to the original problem: how do we create a generic function like "lotsa-sqrts" that isn't restricted to just computing square roots? How do we get it to use any function we pass it as an argument? How do we turn "lotsa-sqrts" into "lotsa-anything"? Let's borrow the "lotsa-sqrts" and make a few necessary changes and see where that gets us: (defun lotsa-anything (func-body input-list) ; two args now (cond ((null input-list) nil) ; nothing new (T (cons (*magic* func-body (first input-list)) ; eh? (lotsa-anything func-body (rest input-list)))))) We know we needed to pass a function body as an argument, so we added that argument. Everything else is pretty much the same, except we don't know how to make the function body passed as "func-body" do anything to the argument passed as "input-list". All we have there is the function body, and since LISP lets us separate function bodies from their names, there must be some way of using a nameless function body, right? I hope you said yes, because it's true. And that's what we need to put in place of *magic* up above in "lotsa- anything". (No, *magic* is not literally a predefined LISP function. It'd be way cool if it were.) LISP actually provides two basic functions for applying anonymous function bodies to arguments. The first is called (you guessed it) "apply". The "apply" function takes two arguments, a function body and a list whose elements are the arguments to be passed to the function body. The function body is then applied to the that list of arguments (i.e., variable bindings are made, the function is then executed, the result is returned...just like in "normal" function evaluation). The "apply" function is the function to choose when you don't know how many arguments to expect (i.e., the anonymous function can take on an arbitrary number of arguments). For example, we can say: ? (apply #'+ '(2 4 6 8)) 20 ? and all is ok. Apply expects an arbitrarily long list of arguments, and "+" can accept any number of arguments. But if we say: ? (apply #'+ 2 4) we get > Error: Can't construct argument list from 4. > While executing: APPLY > Type Command-. to abort. See the Restarts... menu item for further choices. 1 > On the other hand, if you know in advance how many arguments your function is going to expect, like in the case of "sqrt" which always takes only one argument, we can use "funcall" instead of "apply". The "funcall" function takes a function body as its first argument, and the remaining arguments are exactly those arguments needed by the function body passed as the first argument. Thus, we can say: ? (funcall #'sqrt 25) 5 ? but we can't say: ? (funcall #'sqrt 25 36) > Error: Too many arguments. > While executing: SQRT > Type Command-. to abort. See the Restarts... menu item for further choices. 1 > because we'd be trying to pass too many arguments to "sqrt", and we can't say: ? (funcall #'sqrt '(25 36)) > Error: value (25 36) is not of the expected type NUMBER. > While executing: ZEROP > Type Command-. to abort. See the Restarts... menu item for further choices. 1 > because "sqrt" expects a number, not a list. That was easy, wasn't it? So now, which of those two functions, "apply" and "funcall", would be the right one to put in place of *magic* in our "lotsa-anything" function? We'd replace *magic* with "funcall": (defun lotsa-anything (func-body input-list) (cond ((null input-list) nil) (T (cons (funcall func-body (first input-list)) (lotsa-anything func-body (rest input-list)))))) The net result is a function which maps another function onto the elements of a list, performs that operation on each of those elements, and returns a list of the results. Because "lotsa-anything" sort of scoots along, mapping the passed function body onto succeeding "first"s or "car"s of the ever-shrinking "input-list", this function resembles a function called "mapcar". The "mapcar" function is already predefined in LISP, and works sort of like what we've defined here, but it's not quite the same. So while building the "lotsa-anything" function gives us an idea as to what the "mapcar" and its associated mapping functions do, you should take the time to look up "mapcar" and the others in the Steele book and become familiar with what's really going on there. It's these predefined mapping functions like "mapcar" that provide us with a second means of getting repetitive operations without the traditional looping structure. The mapping functions are very efficient in applying a function in sequence to different groupings of an argument list. Again, this style of programming is called "applicative programming" to distinguish it from "recursion" or "iteration". Copyright 1995 by Kurt Eiselt. All rights reserved. -- Kurt Eiselt Assistant Dean, Student Services College of Computing Georgia Institute of Technology Atlanta, GA 30332-0280 http://www.cc.gatech.edu/aimosaic/faculty/eiselt.html