But first, just two more recursion templates
We didn't do this example in class today, but it's
still worth looking at in some depth. The problem here
is to implement our own version of LISP's "remove"
function, which takes two arguments: some possible
list element, and a list. If the possible list element is
"eql" to any top-level element of the given list, then that
element is effectively removed from the list. What is
returned is the orginial list minus any elements that are
"eql" to the possible list element. For example:
? (remove '(a b c) '((a b c) (d e f)))
((A B C) (D E F))
? (remove 'a '(a b a c))
(B C)
?
Note that while 'a is "eql" to 'a, '(a b c) is NOT "eql" to
'(a b c). However, '(a b c) is "equal" to '(a b c).
So how do we implement our own version of "remove"? The
trick here is to look at each element of the list, and if
that element is "eql" to our element to be deleted, we
recursively call "remove" on the rest of the list, thereby
discarding the element to be removed. What if they're not
"eql" and we want to keep the element? Then we use list-
consing recursion (remember...that's a form of augmenting
recursion) to "cons" that element onto the recursive call of
"remove" on the rest of the list. What we end up with is
what's called conditional augmenting recursion, a commmon way
of skipping over some elements and reconstructing with the
others. This is another standard recursive template:
(defun my-remove (element my-list)
(cond ((null my-list) nil)
((eql element (first my-list))
(my-remove element (rest my-list)))
(T (cons (first my-list)
(my-remove element (rest my-list))))))
There's at least one other standard recursion template that's worth
knowing intimately, and it's next.
Multiple recursion
The next function we'll construct isn't already defined in
Common LISP, but most good LISP hackers have one of these
sitting around in their personal library of LISP functions,
or they can at least reconstruct it on demand. I call the
function "flatten", and its purpose is to take any list,
nested arbitrarily deep, and return a simple list of all the
atoms in the original list. In other words, it takes a
hierarchical list structure and returns a linear list
structure, with the order of the atoms intact. Or, in still
other words, it removes all the parentheses except the
outermost ones:
? (flatten '(a (b (c) d (e) f)))
(A B C D E F)
?
How the heck are you supposed to make that happen? Well,
the trick is to use abstraction to break the problem into
three smaller problems: flattening the first element of the
list (which could be an atom or a list) into a linear list,
flattening the rest of the list into a linear list, and then
combining the results into a single linear list. We already
know that the way we join two lists into a single list is by
using "append", so that problem is solved, and the other two
problems are handled with recursive calls to "flatten", so
the whole thing turns out to be pretty easy. Maybe the hardest
part is figuring out what the termination conditions are:
(defun flatten (my-list)
(cond ((null my-list) nil)
((atom my-list) (list my-list))
(T (append (flatten (first my-list))
(flatten (rest my-list))))))
The first termination condition was pretty straightforward.
If "flatten" is passed an empty list, it just returns the
empty list. There's really not anything else to do, and that
empty list will just be "append"ed onto other lists,
resulting in no change whatsoever. The second termination
condition is a little bit tricky, but since I already know
that I'm using "append" to join my lists together, I also
know that if "flatten" is passed an atom, it needs to turn
that into a list so that it can be "appended" onto the other
lists that are being put together by other calls to
"flatten".
The type of recursion that's going on here has several names.
It can be called "multiple recursion" or "tree recursion", or
it can be called "car/cdr recursion", which is a special kind
of multiple or tree recursion.
"Multiple recursion" is an appropriate name because, from the
point of view of our substitution model of evaluation, each
call to "flatten" is substituted on the program stack by
multiple calls to "flatten".
"Tree recursion" is an appropriate name if we think about how
the original problem is broken down: we start by trying to
flatten the original list; that problem becomes two problems
of trying to flatten two parts of that list; each of those
problems spins off two more problems, and so on. If you were
to sit down and draw a little graph of how the original
problem was divided into two subproblems, and each of those
was further divided, and on and on, you'd get yourself a nice
little tree-like structure. (Also, note that a hierarchical
list structure can be interpreted as a tree structure, and
that "flatten" can be viewed as a preorder tree traversal
procedure. But we'll see more on that later.)
Finally, "car/cdr recursion" is appropriate in this
particular use of tree recursion because the initial problem
is broken down into recursive calls of "flatten" on the
"first" (or "car") of the list and the "rest" or ("cdr") of
the list. Today, we might be better off calling it
"first/rest" recursion, but we'd lose that sense of history
and tradition.
The important thing to remember here is that multiple or tree
recursion is a very standard way to travel around really
complex data structures, or to solve some problems with
really complex decompositions. We'll be seeing this
recursion template pop up in lots of different places.
Another way to get repetitive operations
Recursion is just a nice way to get iterative ability out of
our computer programs without adding the complexity that
comes with the introduction of counter variables and
accumulator variables and assignment operations. How do we
know that things get more complex when we add that stuff?
Well, for starters, think about that nice simple substitution
model of evaluation---it's pretty understandable, no?
Imagine what that looks like if we add variables and
assignment. If you could keep track of it all, every
substitution involves keeping track of all the variables and
what they're bound to at every given step of the way...ugh.
It's no longer simple. In fact, the substitution model
becomes so cumbersome under these conditions that we have to
come up with other models of evaluation just to accommodate
the complexity. But I digress.
There's yet another way to get repetition out of our
procedures, and that falls under the header of "applicative
programming" as opposed to "recursive programming". To
demonstrate how applicative programming works, though, we'll
start with a recursive explanation of applicative
programming.
When using a programming language whose basic data structure
is the list, it's not surprising that one of the things we
want to do most often is perform the same procedure on all
the elements of a list. We already know how to do that
recursively. Say for example, that we want to take a list of
numbers as input and return a list of corresponding square
roots. That procedure might look like this (do you recognize
the recursion template?):
(defun lotsa-sqrts (input-list)
(cond ((null input-list) nil)
(T (cons (sqrt (first input-list))
(lotsa-sqrts (rest input-list))))))
and it would work like this:
? (lotsa-sqrts '(1 4 9 16 25))
(1 2 3 4 5)
?
Now, since we often want to do repetitive operations like
this, and most of the time that operation will probably be
something other than the square root procedure, wouldn't it
be nice if we could construct a generic function to which we
could pass another function and a list of things to do be
worked on in succession by that function? Well of course it
would! And folks, note here that when I say we want to pass
it a function, I don't mean passing it a function call to be
evaluated, as in (sqrt (* 2 2))...that's different. I mean
we want to pass the whole function definition as a parameter.
Eek! How the %$#@! are we gonna do that?!
The case of the headless function
In order to make this happen, there are some things you need
to know first. Remember way back when we were touting all
the things that made LISP so nice? One of those features was
the lack of any procedure/data distinction. Here's one place
where that feature comes in handy.
Since a procedure can be treated as a piece of data, we can
decompose procedures into component parts. That means we can
separate a function body from its name. (How did that body
get that name? The "defun" function did it, remember?) In
other words, it's possible (and in many cases desirable, as
we're about to see) to use a nameless, anonymous function
definition in our computations. LISP of course has a
predefined function for separating function bodies from their
names, and it is named, ironically, "function". Here's how
it works:
Let's say you've defined a new function of your own, which in
this case takes a number as an argument and returns its
square. When you use "defun" to create that function:
(defun sqr (number)
(* number number))
you've told LISP to bind the name "sqr" to everything that
follows the name "sqr" in your definition. That association
is then stored in the symbol table. Later, when you call the
"sqr" function, LISP looks up "sqr" in the symbol table and
retrieves the function body associated with it, then uses
that function body to figure out what to do. If we want to
tell LISP at some point to explicitly go to the symbol table,
look up the "sqr" function, retrieve the function body and
return it without doing anything else with it, then we can
just say:
? (function sqr)
Now, if we're doing this in some form of Common LISP that used
to be available here, we'd see this:
> (function sqr)
(LAMBDA (NUMBER) (* NUMBER NUMBER))
>
Actually, we might see more junk in there, but it would be
implementation-dependent detail. What you see just above is
the function body in its "pure" form.
If, on the other hand, we do this in Mac CL, we might get
something that's somewhat less interesting from an
instructional viewpoint, but has the same effect:
? (function sqr)
#[Compiled-function SQR #4E5B46]
?
And if you turn off the incremental compiler, you'd get
something like:
? (function sqr)
#[Interpreted-function SQR #x5C0CBE]
?
This is also an implementation-dependent detail.
Nevertheless, let's go back to the other Common LISP
version:
> (function sqr)
(LAMBDA (NUMBER) (* NUMBER NUMBER))
>
What you're looking at is LISP's basic anonymous function
form, called the "lambda function". The "(number)" part is
the argument list, and the rest is just the function body.
Thus, a lambda function is just a function body that's not
bound to a function name.
That "function" function gets used a lot, it turns out, so
there's an abbreviation. Instead of typing
? (function sqr)
we could type
? #'sqr
and get the same result. That abbreviation is pronounced
"hash-quote".
What can you do with a lambda function?
Back to the original problem: how do we create a generic
function like "lotsa-sqrts" that isn't restricted to just
computing square roots? How do we get it to use any function
we pass it as an argument? How do we turn "lotsa-sqrts" into
"lotsa-anything"?
Let's borrow the "lotsa-sqrts" and make a few necessary
changes and see where that gets us:
(defun lotsa-anything (func-body input-list) ; two args now
(cond ((null input-list) nil) ; nothing new
(T (cons (*magic* func-body (first input-list)) ; eh?
(lotsa-anything func-body (rest input-list))))))
We know we needed to pass a function body as an argument, so
we added that argument. Everything else is pretty much the
same, except we don't know how to make the function body
passed as "func-body" do anything to the argument passed as
"input-list". All we have there is the function body, and
since LISP lets us separate function bodies from their names,
there must be some way of using a nameless function body,
right? I hope you said yes, because it's true. And that's
what we need to put in place of *magic* up above in "lotsa-
anything". (No, *magic* is not literally a predefined LISP
function. It'd be way cool if it were.)
LISP actually provides two basic functions for applying
anonymous function bodies to arguments. The first is called
(you guessed it) "apply". The "apply" function takes two
arguments, a function body and a list whose elements are the
arguments to be passed to the function body. The function
body is then applied to the that list of arguments (i.e.,
variable bindings are made, the function is then executed,
the result is returned...just like in "normal" function
evaluation). The "apply" function is the function to choose
when you don't know how many arguments to expect (i.e., the
anonymous function can take on an arbitrary number of
arguments). For example, we can say:
? (apply #'+ '(2 4 6 8))
20
?
and all is ok. Apply expects an arbitrarily long list of
arguments, and "+" can accept any number of arguments. But
if we say:
? (apply #'+ 2 4 6 8)
we get an error:
> Error: Can't construct argument list from 8.
> While executing: APPLY
> Type Command-. to abort.
See the Restarts... menu item for further choices.
1 >
[Now, we had a little confusion in class due to a comment from the
front row. "The book says that we can do (apply #'+ 2 4 6 8) and
get the correct answer." Well, it turns out that you can't, and
we all misread the example in the book while we were trying to
figure it out, so the examples I put on that %$#^&* whiteboard
were correct. But those examples don't tell the whole story about
"apply" and that's where the confusion came from. You see, the
way that "apply" is usually used is with two arguments: a function
body and a list of arguments for that function body. But, as a
"convenience" (as one book puts it), "apply" can take any number
of arguments so long as the last one is a list. So while
(apply #'+ 2 4 6 8) won't work, (apply #'+ 2 4 (6 8)) will work.
Personally, I never use "apply" in this way, and I always forget
that you can. So please accept my apologies for the confusion in
class.]
On the other hand, if you know in advance how many arguments
your function is going to expect, like in the case of "sqrt"
which always takes only one argument, we can use "funcall"
instead of "apply". The "funcall" function takes a function
body as its first argument, and the remaining arguments are
exactly those arguments needed by the function body passed as
the first argument. Thus, we can say:
? (funcall #'sqrt 25)
5
?
but we can't say:
? (funcall #'sqrt 25 36)
> Error: Too many arguments.
> While executing: SQRT
> Type Command-. to abort.
See the Restarts... menu item for further choices.
1 >
because we'd be trying to pass too many arguments to "sqrt",
and we can't say:
? (funcall #'sqrt '(25 36))
> Error: value (25 36) is not of the expected type NUMBER.
> While executing: ZEROP
> Type Command-. to abort.
See the Restarts... menu item for further choices.
1 >
because "sqrt" expects a number, not a list. That was easy,
wasn't it?
So now, which of those two functions, "apply" and "funcall",
would be the right one to put in place of *magic* in our
"lotsa-anything" function? We'd replace *magic* with
"funcall":
(defun lotsa-anything (func-body input-list)
(cond ((null input-list) nil)
(T (cons (funcall func-body (first input-list))
(lotsa-anything
func-body (rest input-list))))))
The net result is a function which maps another function onto
the elements of a list, performs that operation on each of
those elements, and returns a list of the results.
Because "lotsa-anything" sort of scoots along, mapping the
passed function body onto succeeding "first"s or "car"s of
the ever-shrinking "input-list", this function resembles a
function called "mapcar". The "mapcar" function is already
predefined in LISP, and works sort of like what we've defined
here, but it's not quite the same. So while building the
"lotsa-anything" function gives us an idea as to what the
"mapcar" and its associated mapping functions do, you should
take the time to look up "mapcar" and the others in the
Steele book and/or the Graham book and become familiar with
what's really going on there.
[There was a question in class about whether "mapcar" used
"apply" or "funcall". I couldn't find an answer in my
many manuals, so I went to the one true source...Lyman.
And Lyman told me that this is a question that might be
explored in Lab 3, so have fun.]
It's these predefined mapping functions like "mapcar" that
provide us with a second means of getting repetitive
operations without the traditional looping structure. The
mapping functions are very efficient in applying a function
in sequence to different groupings of an argument list.
Again, this style of programming is called "applicative
programming" to distinguish it from "recursion" or
"iteration".
Note:
You may get the impression from the discussion above that you
can only use apply and funcall with a named function whose body
is then extracted via #' (i.e., the function function). Not
true at all. If the sqr function is defined as follows:
? (defun sqr (x) (* x x))
SQR
then the following are equivalent:
? (apply #'sqr '(2))
4
? (funcall #'sqr 2)
4
? (apply #'(lambda (x) (* x x)) '(2))
4
? (funcall #'(lambda (x) (* x x)) 2)
4
? (apply (function (lambda (x) (* x x))) '(2))
4
? (funcall (function (lambda (x) (* x x))) 2)
4
So note that I could construct a function which builds a list
that just happens to be a lambda function, plunk that inside
a "function" function, and put that inside either "apply" or
"funcall", and thereby evaluate a lambda function that was
constructed in real time by another function. Nothing to it.
Good luck on the midterm
Copyright 1997 by Kurt Eiselt. All rights reserved.
Last revised: January 25, 1997