I. Removing all occurrences of some item from a list
Our next sample problem is to implement a function which takes two arguments:
some possible list element, and a list. If the possible list element is
"equal?" to any top-level element of the given list, then that element is
effectively removed from the list. We'll call the function "remove". You may
recognize it from your homework. What is returned is the original list minus
any elements that are "equal?" to the possible list element, and the original
order of the list elements that remain must be retained. For example:
> (remove '(a b c) '((a b c) (d e f)))
((d e f))
> (remove 'a '(a b a c))
(b c)
But, note what happens, or doesn't happen, here:
> (remove 'a '(a b (a) c))
(b (a) c)
So how do we implement remove? The trick here is to look at each element of
the list, and if that element is "equal?" to our element to be deleted, we
recursively call "remove" on the rest of the list, thereby discarding the
element to be removed. What if they're not "equal?" and we want to keep the
element? Then we "cons" that element onto the recursive call of "remove" on
the rest of the list. What we end up with could be called conditional
list-consing recursion (or maybe conditional augmenting recursion), a common
way of skipping over some elements and reconstructing with the others. It's
sort of a standard recursive template:
(define (remove item input-list)
  (cond ((null? input-list) '())
        ((equal? item (car input-list))
         (remove item (cdr input-list)))
        (else (cons (car input-list)
                    (remove item (cdr input-list))))))
As you all recognize, I'm sure, this example has the potential for lots of
stack usage. Thus, if the list passed as the parameter "input-list" is really
really long, the stack will get really really big. That would seem to be the
kind of inefficiency we should be worried about, especially if we don't know
that the Scheme interpreter is optimizing for us, so we'll want to construct
a tail-recursive version of remove. This is pretty straightforward too: we
use a helper function to introduce an accumulator variable, and once we do that
it's easy to do the real work inside the recursive call and avoid postponing
any computation on the stack:
(define (remove-2 item input-list)
  (reverse (remove-2-helper item input-list '())))
(define (remove-2-helper item input-list result)
  (cond ((null? input-list) result)
        ((equal? item (car input-list))
         (remove-2-helper item (cdr input-list) result))
        (else (remove-2-helper
               item
               (cdr input-list)
               (cons (car input-list) result)))))
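To see why that call to reverse is there, it can help to call the helper directly. The sketch below repeats the two definitions (writing the empty list as '() so it should run in any standard Scheme) and compares the raw accumulator with the final answer.

```scheme
(define (remove-2-helper item input-list result)
  (cond ((null? input-list) result)
        ((equal? item (car input-list))
         (remove-2-helper item (cdr input-list) result))
        (else (remove-2-helper item
                               (cdr input-list)
                               (cons (car input-list) result)))))

(define (remove-2 item input-list)
  (reverse (remove-2-helper item input-list '())))

;; The accumulator comes back in reverse order ...
(display (remove-2-helper 'a '(a b a c) '()))   ; prints (c b)
(newline)
;; ... so remove-2 reverses it once at the end.
(display (remove-2 'a '(a b a c)))              ; prints (b c)
(newline)
```

Each kept element is consed onto the front of result, so the first element kept ends up last. One reverse at the end puts things back in order.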
If you're really worried about efficiency, you might say "EEK!" when you
study remove-2. "Why do we cons the resulting list together backwards and
then reverse it? Why don't we just build the list in the right order in the
first place by appending things to the back end of the list, and eliminate
the extra work performed by that call to reverse?" you might ask. That's a
good question. Let's see what that looks like:
(define (remove-3 item input-list)
  (remove-3-helper item input-list '()))
(define (remove-3-helper item input-list result)
  (cond ((null? input-list) result)
        ((equal? item (car input-list))
         (remove-3-helper item (cdr input-list) result))
        (else (remove-3-helper
               item
               (cdr input-list)
               (append result
                       (cons (car input-list) '()))))))
Hey, that looks great! The code looks pretty much the same except that pesky
reverse is gone, and now you're adding things to the end of the list you're
building instead of adding things to the front. It's so obvious, yet so
elegant.
And so horribly misguided...
II. Append, an alternative to cons
Let's take a closer look at "append" and how it compares to "cons". Append
works a bit differently than cons, with the biggest difference being that
append lets you add things to the "back end" of a list, whereas cons adds
things to the "front end" of a list. Another difference is that while cons
is a primitive function that's embedded in the Scheme system, so that you
have no way of writing your own cons function, append can be (and is)
defined in terms of cons, so you can write your own version of append. What
does append do? Here are some examples:
> (append '(a b) '(c d))
(a b c d)
> (append () '(c d))
(c d)
> (append '(a b) ())
(a b)
> (append 'a '(c d))
append: expects argument of type <proper list>; given a
> (append '(a b) 'c)
(a b . c)
>
Informally, what append does is to join two lists together, as in the first
three examples above. Given two proper lists as arguments, append returns a
single list (or more correctly, a pointer to that single list) that is made
up of the elements of the first list followed by the elements of the second
list. Append insists that the first argument is a proper list, as you can
see in the fourth example; that is, append can only append the second argument
to the first argument if the first argument is a proper list. On the other
hand, the second argument doesn't have to be a proper list, as you can see
in the fifth and final example. However, if that second argument is not a
proper list, the result that append returns will be an unsightly dotted pair.
Now that you know what append does, the obvious question is how does append
do it? Here's a simple implementation of append using cons:
(define (my-append list1 list2)
  (cond ((null? list1) list2)
        (else (cons (car list1)
                    (my-append (cdr list1) list2)))))
Let's trace the behavior of my-append using the substitution model of
evaluation. For an example, let's say we're starting with the list (A B C),
and we'd like to add D to the end of that list, so that append returns
(A B C D). If we do this:
> (my-append '(A B C) 'D)
We get a dotted pair:
(A B C . D)
and we don't want that. So we have to put D inside a list to get what we want:
> (my-append '(A B C) '(D))
(A B C D)
Here's the trace of what just happened:
(my-append '(A B C) '(D))
(cons 'A (my-append '(B C) '(D)))
(cons 'A (cons 'B (my-append '(C) '(D))))
(cons 'A (cons 'B (cons 'C (my-append '() '(D)))))
(cons 'A (cons 'B (cons 'C '(D))))
(cons 'A (cons 'B '(C D)))
(cons 'A '(B C D))
(A B C D)
And if you trace it yourself in Dr. Scheme, you'll see this:
> (trace my-append)
(my-append)
> (my-append '(a b c) '(d))
|(my-append (a b c) (d))
| (my-append (b c) (d))
| |(my-append (c) (d))
| | (my-append () (d))
| | (d)
| |(c d)
| (b c d)
|(a b c d)
(a b c d)
>
You should note that as append, or my-append in this case, conses A onto the
consing of B onto the consing of C onto the second argument, list2, which is
(D), what append is really doing is making a copy of the first argument,
list1, which it then connects to the argument passed as list2. So append
(or my-append) starts with two lists that look like this:
list1                                        list2
  |                                            |
  |                                            |
  |    _______       _______       _______     |    _______
  +-->|   |   |     |   |   |     |   |   |    +-->|   |  /|
      | | | --+---->| | | --+---->| | |   |        | | | / |
      |_|_|___|     |_|_|___|     |_|_|___|        |_|_|/__|
        |             |             |                |
        A             B             C                D
but when append is done, it returns a pointer to the beginning of the copy of
list1 that it made, and at the end of that copy we find a pointer to the
beginning of list2:
list1                                        list2
  |                                            |
  |                                            |
  |    _______       _______       _______     |    _______
  +-->|   |   |     |   |   |     |   |   |    +-->|   |  /|
      | | | --+---->| | | --+---->| | |   |  +---->| | | / |
      |_|_|___|     |_|_|___|     |_|_|___|  |     |_|_|/__|
        |             |             |        |       |
        A             B             C        |       D
                                             |
here's what append returns                   |
  |                                          |
  |                                          |
  |    _______       _______       _______   |
  +-->|   |   |     |   |   |     |   |   |  |
      | | | --+---->| | | --+---->| | | --+--+
      |_|_|___|     |_|_|___|     |_|_|___|
        |             |             |
        A             B             C
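If you'd like to check the picture for yourself, here is a minimal sketch using eq?, which asks whether two things are the very same cons cell rather than just look-alikes. The variable name joined is made up for this example. The rest comes from the discussion above.

```scheme
(define (my-append list1 list2)
  (cond ((null? list1) list2)
        (else (cons (car list1)
                    (my-append (cdr list1) list2)))))

(define list1 (list 'a 'b 'c))
(define list2 (list 'd))
(define joined (my-append list1 list2))

(display joined) (newline)                      ; prints (a b c d)
;; The front of the result is a fresh copy, not list1 itself.
(display (eq? joined list1)) (newline)          ; prints #f
;; After the three copied cells we land on list2 itself, not a copy of it.
(display (eq? (cdddr joined) list2)) (newline)  ; prints #t
;; list1 is left exactly as it was.
(display list1) (newline)                       ; prints (a b c)
```

So the cost of the call is three new cons cells, one per element of the first argument, and none at all for the second argument.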
Now let's look at an example of how cons and append might be used in a
simple function.
III. Append: the black hole of efficiency
Joining lists is a very useful thing, so "append" will no doubt place high on
your list of frequently-used functions. Young Schemers are prone to overusing
"append" however. It's often the case that Scheme programmers want to
construct lists (duh), and they see two different ways of doing that. One way
is to build a list by "cons"ing elements to the front of some initial list
(often the empty list):
(cons element initial-list)
while the other way is to build a list by "append"ing elements to the end of
the initial list:
(append initial-list (list element))
At first glance, these operations seem to do the same sorts of things, the
only difference being that they work on opposite ends of the list.
Consequently, you might infer that these two operations are roughly
equivalent in terms of how they work and the costs that they incur. Nothing
could be further from the truth. Take a careful look at the implementation of
append, using cons, that we developed above:
(define (my-append list1 list2)
  (cond ((null? list1) list2)
        (else (cons (car list1)
                    (my-append (cdr list1) list2)))))
Looking at this version of append, you can see that appending something onto
the back end of a long (or growing) list is computationally expensive. You
have to cons together a copy of the first list before you can run a pointer
from the end of that copy to the second list. In other words, append rebuilds
an entire copy of the first list, cons cell by cons cell, in order to reach
the end of the first list and connect it to the beginning of the second list.
That's a lot of work just to add something onto the end of a list.
The thing to keep in mind is that you often have a choice when it comes down
to how you build a list...you can either build it by consing elements to the
front of the empty list (and then "reverse" the list if necessary), or you
can build it by appending elements to the back of the empty list (which often
seems more intuitively appealing). But if you're building really big lists,
it turns out that the latter approach is computationally much more expensive
than the former. So if you have a choice of building a list by consing
elements to the front and reversing the list once versus appending elements
to the end, make sure you use the cons-and-reverse approach. It's not just a
little savings in both time and cons cells, it's a big big big (linear versus
quadratic) savings in time and cons cells. And it has nothing to do with
whether "append" is implemented with augmenting recursion or tail recursion.
That's about stack space, not time or cons cells...but you could use this
tail-recursive version of append if it will make you feel better:
(define (my-append-tr list1 list2)
  (my-append-tr-helper (reverse list1) list2))
(define (my-append-tr-helper list1 list2)
  (cond ((null? list1) list2)
        (else (my-append-tr-helper (cdr list1)
                                   (cons (car list1) list2)))))
(But now you need a tail-recursive version of reverse. That's coming up soon.)
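A quick check that the tail-recursive version gives the same answers as the earlier append examples. This sketch leans on Scheme's built-in reverse for now.

```scheme
(define (my-append-tr-helper list1 list2)
  (cond ((null? list1) list2)
        (else (my-append-tr-helper (cdr list1)
                                   (cons (car list1) list2)))))

(define (my-append-tr list1 list2)
  (my-append-tr-helper (reverse list1) list2))

(display (my-append-tr '(a b c) '(d)))   ; prints (a b c d)
(newline)
(display (my-append-tr '() '(c d)))      ; prints (c d)
(newline)
(display (my-append-tr '(a b) '()))      ; prints (a b)
(newline)
```

Notice that the helper still does one cons for every element of the first list, and reverse has to build a reversed copy of that list before the helper even starts. The stack stays flat, but the cons bill does not go down.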
How much of a savings do you get? Keep reading.
IV. Fun with mathematics
Let's go back to our last two "remove" functions and see just how big a
difference there is here. Let's say we want to remove all occurrences of some
symbol from a 1,000 element list. And just for fun, let's examine the extreme
case where the thing I want to remove from the list turns out not to be in
the list in the first place. In the process of finding that there's nothing
to remove, remove-2 performs 1,000 conses as it builds a new copy of the
original list. But that new list has been built in reverse order, so we need
to call the "reverse" function to turn it around. How does reverse work? Like
this:
(define (my-reverse revlist)
  (my-reverse-helper revlist '()))
(define (my-reverse-helper revlist result)
  (cond ((null? revlist) result)
        (else (my-reverse-helper (cdr revlist)
                                 (cons (car revlist)
                                       result)))))
That's a nice, efficient tail-recursive version of reverse, and we'll assume
that's how Scheme's reverse is implemented, so that we don't throw our
comparisons off.
And now that you see how reverse works, you can see that to reverse a 1,000
element list will again take 1,000 conses. So trying to remove an item from a
1,000 element list can cost as many as 2,000 conses, using remove-2:
(define (remove-2 item input-list)
  (reverse (remove-2-helper item input-list '())))
(define (remove-2-helper item input-list result)
  (cond ((null? input-list) result)
        ((equal? item (car input-list))
         (remove-2-helper item (cdr input-list) result))
        (else (remove-2-helper
               item
               (cdr input-list)
               (cons (car input-list) result)))))
Seems like about 1,000 conses too many, no? If we worked with a 2,000 element
list, it could take as many as 4,000 conses. And if we worked with a list of
length n, it could take as many as 2n conses, with half those conses being
spent on just putting the list back in the original order.
Now let's see what happens if we do the same thing with remove-3:
(define (remove-3 item input-list)
  (remove-3-helper item input-list '()))
(define (remove-3-helper item input-list result)
  (cond ((null? input-list) result)
        ((equal? item (car input-list))
         (remove-3-helper item (cdr input-list) result))
        (else (remove-3-helper
               item
               (cdr input-list)
               (append result
                       (cons (car input-list) '()))))))
At the very beginning, we compare the thing we're trying to remove to the
first element of the original list. It's not there, so remove-3 makes a list
out of that first element, which you can do by consing it to the empty list,
so let's say that costs us one cons, and then appends that new one-element
list to the end of the empty list. Appending to the end of the empty list
doesn't cost any conses, so the total cost is one cons. Now remove-3 compares
the item to be removed to the second element of the original list and there's
no match. So remove-3 makes a list out of that second element and appends
that list to the end of the one-element list from the previous step.
Appending to a one-element list costs one cons, so the cost of seeing if we
want to remove the second element of the original list is two conses. Now for
the third element of the original list: it's not a match, so we make a list
out of it by expending one cons, and we append that list to the two-element
list from the step before at a cost of two conses, for a total expenditure of
three conses. So in review, to process the first element costs 1 cons,
processing the second element costs 2 conses, and processing the third
element costs 3 conses, for a total of 6 conses so far.
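You don't have to take the counting on faith. Here is a rough sketch that tallies conses as remove-3 runs. The names cons-count, counting-cons, counting-append, and conses-used are invented for this illustration, and it assumes the built-in append copies its first argument the same way my-append does, which is why my-append's logic is reproduced here with the counting cons.

```scheme
(define cons-count 0)

;; Hypothetical helper: behaves like cons but also bumps the counter.
(define (counting-cons a b)
  (set! cons-count (+ cons-count 1))
  (cons a b))

;; Same logic as my-append, built on counting-cons.
(define (counting-append list1 list2)
  (cond ((null? list1) list2)
        (else (counting-cons (car list1)
                             (counting-append (cdr list1) list2)))))

(define (remove-3-helper item input-list result)
  (cond ((null? input-list) result)
        ((equal? item (car input-list))
         (remove-3-helper item (cdr input-list) result))
        (else (remove-3-helper
               item
               (cdr input-list)
               (counting-append result
                                (counting-cons (car input-list) '()))))))

(define (remove-3 item input-list)
  (remove-3-helper item input-list '()))

;; Returns the number of conses spent removing item from input-list.
(define (conses-used item input-list)
  (set! cons-count 0)
  (remove-3 item input-list)
  cons-count)

(display (conses-used 'z '(a b c)))     ; prints 6, that is 1 + 2 + 3
(newline)
(display (conses-used 'z '(a b c d)))   ; prints 10
(newline)
```

Try it on longer lists where nothing matches and watch how fast the count climbs.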
There's a pattern developing here that will eventually show us that removing
an item from a 1,000 element list could cost 1 + 2 + 3 + ... + 1,000 conses.
More generally, removing an item from an n-element list could cost as much as
1 + 2 + 3 + ... + n-1 + n conses. There's a shorter equivalent for that
sum:
1 + 2 + 3 + ... + n-1 + n = n * (n + 1) / 2
So, if we plug 1,000 in for n, we get the cost of removing an item from a
1,000 element list as being 500,500 conses for remove-3, compared to 2,000
conses for remove-2. That "extra" work done by reverse in remove-2 doesn't
seem so bad now, does it? But what's a half-million conses on today's
blazingly fast machines? No big deal. But what if n gets really big? For
example, let's say we've been working in that dream job at the Social
Security Administration for a while now, and I want to use remove to delete
a record from that master file (undoubtedly the Social Security folks are
using Dr. Scheme and the names are all stored in one big linked list, no?).
The population of the United States is very roughly 280,000,000 or so these
days, so let's set n conservatively at 280,000,000. If I try to remove the
record using remove-2, that's a worst case of 560,000,000 conses. If my
computer can perform, say, 1,000,000 conses per second, that whole thing will
take 560 seconds or 9 minutes and 20 seconds. If I do that same operation
with remove-3, that's gonna be 39,200,000,140,000,000 conses. Assuming I have
enough memory to do all those conses, if I can do 1,000,000 conses per
second, that's 39,200,000,140 seconds. If there are 31,536,000 seconds in a
year, that's just slightly more than 1243 years! Yes, that name will be
deleted from the Social Security master list sometime around the year
3246...and that's kind of slow, even for the federal government. And of
course, you know the power is gonna go out at least once during that time,
and you'll have to start over....
I guess the good news is that since I was using tail recursion, the
activation stack didn't have to grow.
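If you'd rather let Scheme do the arithmetic, the numbers above can be reproduced with a few definitions. This is only a sketch of the worst-case formulas from the text. The function and variable names are made up here, and it assumes your Scheme handles integers of this size exactly.

```scheme
;; Worst-case cons counts from the text, for a list of n elements.
(define (remove-2-conses n) (* 2 n))
(define (remove-3-conses n) (quotient (* n (+ n 1)) 2))

(define conses-per-second 1000000)   ; rate assumed in the text
(define seconds-per-year 31536000)   ; 365 days

(define n 280000000)

(display (remove-2-conses n)) (newline)   ; prints 560000000
(display (remove-3-conses n)) (newline)   ; prints 39200000140000000
;; Whole seconds needed by remove-2.
(display (quotient (remove-2-conses n) conses-per-second))
(newline)                                 ; prints 560
;; Whole years needed by remove-3.
(display (quotient (quotient (remove-3-conses n) conses-per-second)
                   seconds-per-year))
(newline)                                 ; prints 1243
```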
V. Analyzing the cost of a bad decision
Let's go back and think about our comparison of building a list by consing to
the front of the list to building a list by appending to the end of the list.
If there were no difference in how much work these two operations do to get
the job done, then we wouldn't have had much of a discussion. But in fact,
cons and append, even though they both add to ends of a list, work in
entirely different ways. That was one of the points of the whole
discussion---if you don't know what the operations in your programming
language, any programming language, do, then you run the risk of making very
bad decisions. In this case, appending to the end of a list is so much more
expensive than consing to the front that even if you had to reverse the list
you had cons'ed together to get it to look right, you'd still gain a big
savings over appending, if you're working with any list with more than about
3 or 4 items.
But there are a bunch of other factors that impact this analysis, and you
should know about them too. For example, the fact that lists in Scheme give
you immediate access to the front of the list via a pointer, but not to the
end of the list, is a big factor. If you had a version of Scheme, for
instance, that also provided you with a pointer to the end of the list, you
could have an append operation that would tack something on the end of the
list without having to start at the front of the list and work its way down.
It's not likely that Scheme will ever provide you with that ability, but I
suppose you could write your own. In any case, it's important to note that
the way a given abstract data type is implemented in some language is
something else you need to know intimately.
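Just to make the "write your own" idea concrete, here is one possible sketch of a list that remembers where its last cell is. None of this is built into Scheme: make-tlist, tlist-add-last!, and tlist->list are hypothetical names, and the sketch assumes a Scheme with mutable pairs (set-car! and set-cdr!, as in R5RS).

```scheme
;; A hypothetical "tail-tracked" list: a pair whose car points at the
;; first cell of the list and whose cdr points at the last cell.
(define (make-tlist) (cons '() '()))

;; Adds element at the back with one cons, by mutating the old last cell.
(define (tlist-add-last! tlist element)
  (let ((new-cell (cons element '())))
    (if (null? (car tlist))
        (set-car! tlist new-cell)          ; first element: it is also the front
        (set-cdr! (cdr tlist) new-cell))   ; otherwise: hook onto the old tail
    (set-cdr! tlist new-cell)              ; remember the new last cell
    tlist))

;; Returns the ordinary Scheme list held inside.
(define (tlist->list tlist) (car tlist))

(define names (make-tlist))
(tlist-add-last! names 'a)
(tlist-add-last! names 'b)
(tlist-add-last! names 'c)
(display (tlist->list names))   ; prints (a b c)
(newline)
```

The price is mutation: the list is changed in place instead of copied, so anybody else holding a pointer into it sees the change too. That's a very different bargain from the one append makes.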
And then there's the fact that we chose to do our analysis using a worst case
scenario. Oddly enough, in this particular example, the worst case is when
the thing you're trying to remove from the list doesn't exist in the list.
That means the functions end up rebuilding the entire input list...nothing is
left out. The best case would be if the input list consisted of nothing but
the thing you were trying to remove. For example, if you wanted to remove the
symbol 'x from this list '(x x x ... x x x), the thing that gets returned is
the empty list. Nothing was cons'ed (or appended) to the empty
list...everything was removed. So, even though cycles are being expended,
there aren't any cons'es being performed, so our analysis shows that little
if any work is being done in this case. More interesting would be some sort
of average-case analysis, but it's generally very hard to come up with an
acceptable average case. So folks tend to lean toward worst-case scenarios,
since they're usually easier to figure out.
In summary, the big message here is that being a really good computer
programmer is more than just translating ideas into instructions. You have to
understand how the instructions work, you have to understand how the data
structures work, and you have to understand how the program is likely to be
used or abused...what are the normal cases, and what are the pathological
cases. This is what makes the difference between a second-rate hack and a
first-rate computer scientist. This is where the science part of computer
science comes into play. You have to understand it all, not just the part
that you like. You have to do the math.
VI. How to talk about the math
If we assume that consing is the big expense in these particular problems,
then we can say that the work that remove-2 does when given a list of N
elements is 2N. Another way of saying that in computer science terms is that
the time complexity of remove-2 is on the order of 2N, which is written
as O(2N), and pronounced "Oh of 2N" or "Big Oh of 2N". When we're talking
about orders of complexity, we simplify by throwing away constants. In other
words, for this particular case, O(2N) just tells us that work is directly
proportional to the length of the input list, and that's a linear
relationship, so all we care about is that the complexity is O(N), and
whether it's 2*O(N) or 3*O(N) or any constant times O(N), that's just noise
when N gets really really big.
In the case of remove-3, the number of conses done for a list of N elements
is 1/2 * (N*N + N). As N gets really big, the constant multiplier of 1/2 doesn't
mean much, and we can throw that away. Similarly, even though N is really
big, it's nothing compared to N*N, and the distance between the two values
grows and grows as N gets bigger and bigger, so we can ignore the smaller
term when comparing this behavior to remove-2's behavior. For remove-3, the
time complexity is on the order of N-squared, or O(N^2) since I can't show a
superscript 2 here. As we've seen, O(N^2) complexity is something to avoid if
we can, especially if N is going to get big. (For little Ns, it may not make
any difference at all.) In our example cases, the amount of time these
different approaches take is directly related to the amount of memory
necessary (since a cons eats up not only a fixed chunk of time but a fixed
chunk of memory). So in terms of both time and memory, remove-2 exhibits
O(N) behavior and remove-3 exhibits O(N^2) behavior. But keep in mind that
it's not always the case that time requirements and memory usage will grow
hand in hand. There are lots of cases where you will see big differences
between time usage and memory usage; in fact, programmers are often faced
with a time/space tradeoff, where using up more of one resource will let you
conserve the other resource.
VII. What if we make the hardware faster?
Won't faster processors make this problem just go away? No, not really. Let's
say you replace your 1,000,000 cons per second processor with a 10,000,000
cons-per-second processor. Now it only takes 124.3 years to finish the work.
That's not really acceptable either. So maybe we can add another order of
magnitude speed boost and go to 100,000,000 conses per second. Now we're down
to 12.4 years. That's still not acceptable. How about a billion conses per
second. Now we're screamin'. We're down to 1.24 years...still bad. At the
same time, there's nothing to stop the problem from getting bigger. What if
the Social Security database gets bigger? Any gains we made by adding faster
or more processors just get eaten up if the size of the problem increases.
And the problem size always gets bigger. Always. More or faster hardware is
seldom if ever the solution if bad software design is the problem.
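Here's a small sketch for playing with those numbers yourself. The names remove-3-years and show-years are made up for this example, and it uses inexact (floating-point) arithmetic, so expect the printed values to be approximate.

```scheme
(define seconds-per-year 31536000.0)   ; 365 days, as in the text

;; Worst-case years for remove-3 on an n-element list at a given cons rate.
;; Inexact arithmetic is used so no very large exact integers are needed.
(define (remove-3-years n conses-per-second)
  (/ (* n (+ n 1.0)) 2.0 conses-per-second seconds-per-year))

(define (show-years n conses-per-second)
  (display n)
  (display " elements at ")
  (display conses-per-second)
  (display " conses/second: ")
  (display (remove-3-years n conses-per-second))
  (display " years")
  (newline))

;; Faster hardware, same problem: each 10x speedup divides the time by 10.
(show-years 280000000 1000000)
(show-years 280000000 10000000)
(show-years 280000000 100000000)
(show-years 280000000 1000000000)

;; Same fastest hardware, but the list doubles in length: about 4x the time.
(show-years 560000000 1000000000)
```

With quadratic growth, doubling the list roughly quadruples the work, so a modest increase in problem size can wipe out an expensive hardware upgrade.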
VIII. The moral to the story
Experience tells us that some students encountering a new programming
language for the first time learn about that language in less than
beneficial ways. For example, a student might be working on a programming
problem in this new language, and say "Hmmm. I need to remove every third
element from this data structure and then split it into two halves. I wonder
if there's an instruction that does that?" So they look it up in whatever
resources are available, and maybe they even find that instruction. Off goes
the student, happily using the instruction without bothering to learn how the
instruction actually works or what the cost of using that instruction might
be. The really good programmers will take the time to learn everything that
can be learned about the instruction, but the lesser hacks will be satisfied
with learning the basic functionality and stop well short of mastery. As
educators, we try to encourage mastery, and in the early going that means
forcing you to use certain operations and ignore others until you really
understand how it's all working.
This sort of thing happens frequently with "append". Often, new students will
view append as something to be used to add something to the back end of a
list, in the same way that cons adds something to the front end of a list.
And while there's a little bit of truth to that assumption, the assumption is
much more wrong than it is right. Because of the way lists are constructed in
Scheme, as we have just seen, it turns out that building a list by adding
things to the back end is much less efficient than building that same list by
adding things to the front and then reversing the list one time. Make sure
you understand the cost of using append before you start using it.
"But wait," said the student in question, "the professor said we don't have
to worry about efficiency." Yes, but if you've been attending class regularly,
staying awake, or at least keeping up with reading the lecture notes, you
know that what we've really said all along is that while we don't worry about
little efficiency issues, like saving one operation at the end of a bunch of
operations, we do worry about big efficiency issues, like whether a program
will take a minute, an hour, a week, a year, or a century or more to finish.
It's that latter time frame (the "or more" time frame) that applied in the
example above.
So, the moral to this story is that simple little decisions like using append
instead of cons can cost you lots of time and lots of memory. In this case,
the difference appears to be linear growth in memory usage versus quadratic
(i.e., the variable N is raised to the constant power 2) growth in memory
usage. And it looks like the difference in growth of the amount of time
needed to complete the computation is linear versus quadratic too. So while
we don't want you to sacrifice readability of your code to save a few bytes
here or a few cycles there, on the other hand we DO want you to worry about
linear versus quadratic (or worse) differences...these are the kinds of
differences that determine whether your program will finish in a few seconds
or whether it will still be running when the Sun burns out. Don't be a
brain-dead programmer, regardless of the language you're using or the problem
you're trying to solve.
Things to learn from this example:
1. cons and append do not do the same thing to different
ends of a list
2. Tail recursion does not solve all problems
3. In any given programming language, operations
that look like they might be similar in nature are
often very different. Know your language.
4. Some inefficiencies cost relatively little and aren't
really worth worrying about. Other inefficiencies cost
more than you might imagine; take the time to analyze
the behavior of the procedures you create before you
unleash them on an unsuspecting world.
5. All the arguments about which language is faster,
which operating system is faster, and which chip is
faster are moot if the programmer is a moron. Almost
always, performance problems arise and programming
projects fail not because the wrong language or
computer or operating system was used... projects fail
because many of the folks out there who create
computer programs for a living are idiots.
6. Don't be an idiot.
IX. Tree recursion
As long as we've spent so much time on the problem of how to remove things
from a list, let's not waste all that effort. Just for fun (yeah, sure) let's
think about the problem of removing not just the top-level occurrences of
some item from a list, but ALL occurrences of that item from a list, no
matter how deeply nested those items might be in that list. So instead of
this:
> (remove 'x '(a x a (x) (a (x b))))
(a a (x) (a (x b)))
we'd get this
(a a () (a (b)))
Here's the original remove, in its augmenting recursion form:
(define (remove item input-list)
  (cond ((null? input-list) '())
        ((equal? item (car input-list))
         (remove item (cdr input-list)))
        (else (cons (car input-list)
                    (remove item (cdr input-list))))))
Think for a minute about what this version of remove does: it scoots along a
flat list, comparing elements of the list to the item to be removed. The new
remove needs to do exactly the same thing, except that if the element that's
coming from the list being examined is itself a list (that is, if the car of
the input-list is a list), and that list doesn't match the target item to be
removed, then we need to try to remove the target item from that list. How
do we do it?
First, let's give this thing a new name...remove-all, because it goes
everywhere in the list structure and removes all occurrences of the item,
instead of hovering at the top level of the list.
(define (remove-all item input-list)
There. That first part is done. Now what? We need to test to see if we've
looked at all of input-list. How? Just like in the old version:
  (cond ((null? input-list) '())
If we're not looking at the empty list, then there's the possibility that I
want to remove something from the input-list. How? Again, just like in the
old version. I test to see if item looks like the first thing in the
input-list. Note that item itself could be a list, and if it's a list that
looks like the first thing on the input-list, then I remove it. And I remove
something from the input-list by NOT consing that thing onto the copy of the
list that I'm building as I go:
        ((equal? item (car input-list))
         (remove-all item (cdr input-list)))
Now things start to get different from the old version of remove. So far,
I've tested to see if I'm done by looking to see if I have an empty list. If
that's not the case, then I test to see if the first thing on the input-list
matches the item I want to remove from the input-list. But what if that's not
the case? Well, there are now two different possibilities I need to consider.
One possibility is that the first thing on the input-list is itself a list
(one that didn't happen to look like whatever was passed as item). If so,
I'll want to remove all occurrences of item from that list, which is the car
of the input-list, AND I'll want to remove all occurrences of item from the
rest or cdr of the input-list. Oh, I'll also want to join the results of
those two calls to remove-all together somehow. How? Like this:
        ((list? (car input-list))
         (cons (remove-all item (car input-list))
               (remove-all item (cdr input-list))))
All I've done here is call remove-all twice---once on the car of the
input-list, and once on the cdr of the input-list. It may look a little
weird because you haven't seen it before, but it's just plain old recursion.
No big deal. And why did I use cons to join the results together? Because
cons is what we use to put together a list that we've taken apart with car
and cdr, assuming we want the result to have the same structure as it did
when we started, except for the things being removed of course.
What's the other condition that we have to worry about? That's where the
first thing on the input-list isn't a list and it doesn't match the input
item (and we're not looking at the empty list). What do we do then? Just
exactly what we did with the original remove...we cons that first thing
onto the result of calling remove (or remove-all in this case) on the
rest of the input-list:
        (else (cons (car input-list)
                    (remove-all item (cdr input-list))))))
Here's the whole thing in one piece:
(define (remove-all item input-list)
  (cond ((null? input-list) '())
        ((equal? item (car input-list))
         (remove-all item (cdr input-list)))
        ((list? (car input-list))
         (cons (remove-all item (car input-list))
               (remove-all item (cdr input-list))))
        (else (cons (car input-list)
                    (remove-all item (cdr input-list))))))
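Here is the finished function again with a few calls you can try, including the nested example from the start of this section. The second and third test lists are made up for illustration.

```scheme
(define (remove-all item input-list)
  (cond ((null? input-list) '())
        ((equal? item (car input-list))
         (remove-all item (cdr input-list)))
        ((list? (car input-list))
         (cons (remove-all item (car input-list))
               (remove-all item (cdr input-list))))
        (else (cons (car input-list)
                    (remove-all item (cdr input-list))))))

(display (remove-all 'x '(a x a (x) (a (x b)))))   ; prints (a a () (a (b)))
(newline)
;; The item may itself be a list; matching sublists vanish at any depth.
(display (remove-all '(x) '(a (x) ((x) b))))       ; prints (a (b))
(newline)
;; Nothing to remove: the whole structure is rebuilt unchanged.
(display (remove-all 'z '(a (b (c)))))             ; prints (a (b (c)))
(newline)
```

Notice in the first call that (x) becomes () rather than disappearing: the sublist itself didn't match the item, only its contents did.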
The type of recursion that's going on here has several names. It can be
called "multiple recursion" or "tree recursion", or it can be called
"car/cdr recursion", which is a special kind of multiple or tree recursion.
"Multiple recursion" is an appropriate name because, from the point of view
of our substitution model of evaluation, each call to "remove-all" is
substituted on the program stack by multiple calls to "remove-all".
Stack usage could be fairly huge, no? I'll let you trace it on your own.
"Tree recursion" is an appropriate name if we think about how the original
problem is broken down: we start by trying to remove something from
everywhere in the original list; that problem becomes two problems of trying
to remove from two parts of that list; each of those problems spins off two
more problems, and so on. If you were to sit down and draw a little graph of
how the original problem was divided into two subproblems, and each of those
was further divided, and on and on, you'd get yourself a nice little
tree-like structure. What's a tree? It's that thing in the yard with the
leaves. But it's also a data structure that you may not know about yet.
You will.
Finally, "car/cdr recursion" is appropriate in this particular use of tree
recursion because the initial problem is broken down into recursive calls of
"remove-all" on the "car" of the list and the "cdr" of the list.
If you test it out, sure enough, it works. Sometimes, when you know you
need to do tree recursion to solve some problem but can't figure out how,
solve a simpler problem. That is, solve the problem for a flat structure like
a simple list, then extend the solution to a more hierarchical or nested
structure (i.e., a tree...like we said, you'll see more about these data
structures soon...for now, just remember that we said this here), and
then add the tree (or multiple or car/cdr) recursion.
The important thing to remember here is that multiple or tree recursion is a
very standard way to travel around really complex data structures, or to
solve some problems with really complex decompositions. We'll be seeing this
kind of recursion pop up in lots of different places.
And remember...you can never get enough practice at this stuff. Even if
you're very comfortable with what we've done recursively so far, this might
be a little confusing for you. So don't have a meltdown...just practice more.
Speaking of more practice, there's plenty coming on your next homework
assignment.
Copyright (c) 2003 by Kurt Eiselt. All rights reserved, with
the exception of stuff that belongs to somebody else.
Last revised: September 18, 2002