The big transition
Up to now, we've followed a very pure functional approach to
programming:
no assignment
no iteration (of the traditional count and loop kind)
no side effects
reliance on referential transparency (i.e., the result of
a function is dependent only on the parameters passed
to it)
But hey, it sure would be nice to be able to have a function
whose behavior is dependent on its history---that is, a
function that has a sense of state.
As we've seen, things in the real world have state. For
example, if we want our Alquerque programs to learn so as to
play better next time, we'd like to be able to save stuff
that doesn't go away when the program ends. Or if we want to
model physical systems like pizza shops, hotels, cafeterias,
gas stations, automated tellers, coke machines, operating
systems, or even robots, all of which exhibit varying behavior
over time as a result of their previous history, we want to be
able to retain something about history or state.
But purely functional programming severely hampers our
ability to do this. What we're going to need to make this
whole state thing work are variables and assignment.
We've seen variables in LISP already, although we haven't
often referred to them as such. Those are the symbols in
the argument list of a function, but we don't currently have
much control over them. Today we'll learn how to create
them, control them, and use them.
Variables and scoping in LISP
The variables created in the argument list of a function
definition are, as you may know, local to the function, and
they are bound to values when the function is called. Those
values are accessible during the execution/evaluation of the
body of the function, but they cannot be accessed after the
function evaluation has ceased. Furthermore, those values
cannot be accessed or altered by functions which are called
by the current function.
When the bindings of variables in the argument list are
accessible only within the body of the function, this is
called "lexical scoping", and the variables are said to be
"lexically scoped". (This is also called "static scoping".)
Here's an example:
(defun add-3 (number)
(+ 3 number))
It's a silly example, but it works. Compare this to:
(defun add-3 (number)
(add-a-constant 3))
(defun add-a-constant (const)
(+ number const))
This example doesn't work. Why? Because "number" is
accessible in "add-3" but not "add-a-constant", and that's
a result of lexical scoping. If these variables were
"dynamically scoped", the latter example would work just
fine. Under dynamic scoping, the variables in the argument
list are bound when the function is called, and the values
are accessible by this function and any function called by
this function (and so on) until the evaluation of this
function has ceased. In other words, the variable values are
accessible to any functions called by this function even
though the values aren't passed to the called functions as
parameters.
You can think of a dynamically scoped variable as being local
to the function in which that variable is first bound, but
global to all functions called by that first function (and
all functions called by those functions, and so on...). But
don't confuse a dynamically scoped variable with a global
variable...there are limitations on access to dynamically
scoped variables, but there are no limitations to accessing
global variables.
Dynamic scoping was employed in most earlier dialects of
LISP, but lexical scoping is the default in Common LISP.
Lexical scoping is preferred over dynamic scoping for two
reasons:
1. When you're thinking about when or where variable
bindings can be accessed, it's easier to think in terms of
"they're accessible where the variable names can be seen in
the body of the function" instead of "they're accessible in
this function, even if you don't see the variable names, so
long as this function was called by another function in which
these variables were accessible." (Whew!) You'd like to be
able to understand and debug a function based on what you
know just by looking at the function itself. You don't want
to have to know where you are in the overall control flow of
a group of interdependent functions to know how this
particular function is going to work at this point in time.
Again, it's an issue of controlling complexity.
2. Under lexical scoping, you can use a variable name
locally in one function, and you (or someone else) can use
the same name locally in another function, and neither will
clobber the other. Lexical scoping avoids variable name
conflicts. Under dynamic scoping, there is a possibility
that this undesirable situation could arise and, depending on
who calls what and when, one value of a variable could be
inadvertently wiped out by another. This can get pretty
nasty in a communal programming environment; be sure to
practice safe programming. (Using packages avoids this
problem too; you learned about packages in tonight's
lab, if not last week's lab, I hope.)
We can get dynamic scoping in Common LISP if we really want
to. We do this by declaring the variable or variables
in question as "special":
(defun add-3 (number)
(declare (special number))
(add-a-constant 3))
(defun add-a-constant (const)
(declare (special number))
(+ number const))
The first declaration tells Common LISP "set up the variable
'number' as a special dynamically-scoped variable when you
evaluate 'add-3'". The second declaration tells Common LISP
that "'number' is a dynamically-scoped variable---it's not
created in the 'normal' way in this function, but go ahead and
access its value in this function". Any function that uses a
dynamically-scoped variable has to also contain the special
declaration.
So now if LISP has evaluated the two function definitions above for
"add-3" and "add-a-constant", we should get predictable behavior
when we test it out. So we try to add 3 to the value 7 like this:
? (add-3 7)
10
and sure enough it works. The value 7 wasn't explicitly passed
between the two functions, but making "number" a dynamically-
scoped variable did the trick.
Now, does the value bound to "number" persist after "add-3"
and "add-a-constant" are done? We didn't do this in class
today, but just so you know what's up, let's make another function
and see what happens. I have the same two functions as before,
plus a third one in which I also make "number" a dynamically-
scoped variable.
(defun add-3 (number)
(declare (special number))
(add-a-constant 3))
(defun add-a-constant (const)
(declare (special number))
(+ number const))
(defun just-add-4 ()
(declare (special number))
(+ number 4))
So I try to add 3 to 6, expecting to see 9 returned:
? (add-3 6)
9
That works, just like before. But now if I count on "number"
still being bound to 6 and try to add 4 to it, expecting to see 10
returned, it blows up:
? (just-add-4)
> Error: Unbound variable: NUMBER .
> While executing: JUST-ADD-4
So the value of "number" does not persist.
In any case, adding dynamic scoping to your LISP code may be something
you'll want to do in the distant future, but you don't really need
it now unless you really goober up your Alquerque program and you
think that dynamic scoping will make things better...yeah, right.
Introducing more local variables
Sometimes you'd like to have more local variables in your
functions than just what's available through the argument
list. One way to introduce more variables is to use a
"helping function" and add additional variables in the
argument list to the helping function. But you already knew
that.
Another way to introduce additional local variables is to use
LISP's "let" function:
(let ((variable1 value1)
(variable2 value2)
:
:
(variablen valuen))
)
The "let" form allows you to bind a bunch of values to
variables, and makes all those variables local to the code
embedded within the "let" form. We'll see an example of how
to use it later on.
Assignment
We talked a little bit about assignment early on in this course.
Let's do a little refresher, and then let's get dangerous.
The fundamental method of assignment in LISP is the "set" function:
(set X 3)
But this doesn't do what you probably expect it to do,
because like most good LISP functions, it evaluates its
arguments. If X is bound to another symbol, like "A", the
net result would be the equivalent of A := 3. If X is not
bound, however, LISP will give you an error message.
What you usually want to do is this:
(set (quote X) 3)
which is the equivalent of X := 3.
This is needed so often that there's a shorthand version of
it, called "setq", which is short for "set quote":
? (setq X 3) ;; setq binds a value to a symbol
3
? X
3
There's another function that does the same thing, plus a bit
more. It's called "setf":
? (setf X 3)
3
? X
3
SETF and the generalized variable
We tend to think of a variable as a place where you store a
data object. In reality, a variable in LISP is a place where
you store a pointer to a data object. A "generalized
variable" in Common LISP is any expression that allows us to
access a particular place in memory. So, for example, when
LISP evaluates:
(setf X '(A B C D))
the resulting construct in memory looks like this:
X
|
|
| _______ _______ _______ _______
+-->| | | | | | | | | | | /|
| | | --+---->| | | --+---->| | | --+---->| | | / |
|_|_|___| |_|_|___| |_|_|___| |_|_|/__|
| | | |
A B C D
Assignment in LISP is nothing more than replacing one pointer
with another (or initializing a pointer). So X designates a
pointer, and (nth 2 X) designates a different pointer:
X (nth 2 X)
| |
| |
| _______ _______ | _______ _______
+-->| | | | | | | | | | | | /|
| | | --+---->| | | --+---->| | | --+---->| | | / |
|_|_|___| |_|_|___| | |_|_|___| |_|_|/__|
| | | | |
A B +--> C D
Both X and (nth 2 X) are place descriptions; they're both
generalized variables. Consequently, if we have LISP
evaluate the expression:
? (setf (nth 2 X) 'Q)
the list above will be changed to:
X (nth 2 X)
| |
| |
| _______ _______ | _______ _______
+-->| | | | | | | | | | | | /|
| | | --+---->| | | --+---->| | | --+---->| | | / |
|_|_|___| |_|_|___| | |_|_|___| |_|_|/__|
| | | | |
A B +--> Q D
and if we type the following at our LISP evaluator:
? X
we'll see LISP return:
(A B Q D)
If we tried this with the "setq" function, it wouldn't work,
and that's the difference between "setq" and "setf". The
"setf" function goes into the actual place in memory that the
pointer is pointing to and replaces the value there. This is
what's known as "destructive modification", and it's not
usually what you want to do. Why? It's dangerous,
especially in a communal programming environment.
Say for example that you have a system in which information
is shared between functions via global variables. There are
times when this is the right thing to do. To indicate
globals in LISP, we typically use a function called "defvar",
and we put asterisks around the variable name just to make it
stand out:
? (defvar *X* '(A B C D))
*X*
?*X*
(A B C D)
But then you might set up another global variable like this:
? (defvar *Y* (rest (rest *X*)))
*Y*
? *Y*
What happens if some other unsuspecting soul comes by and,
thinking that *X* and *Y* are independent, decides to change
*Y*?
? (setf (first *Y*) 'Q)
Q
? *Y*
(Q D)
This looks like what we wanted to happen, but now when we go
back and look at X, we find that it's changed, and we may not
have wanted that:
? *X*
(A B Q D)
In situtations like this, it would be safer to make a copy of
the data object before messing with it. LISP has functions
called "copy-list" and "copy-tree" for doing exactly this, so
make sure you look them up in your book.
The obvious next question is, if destructive modification is
inherently unsafe, especially in communal programming
environments, why do it at all? The answer is that copying
data structures, especially if they're big, eats up space and
time. While one of the really nice things about a dynamic
programming language like LISP is that you typically don't
worry about memory management, the fact is that the more
copying of data structures that you do, the more memory gets
used up. That'll get in your way sometimes. Furthermore, as
that same memory becomes unused, the space will have to be
recovered by a "garbage collector" and returned to "available
memory". When the garbage collector is running, that also
eats up time, and that's not desirable either.
So, people will use destructive modifications for the sake of
efficiency, and when those situations occur where saving
computational resources is critical, you should think about
using destructive modifications too. But of course, you
won't encounter these situations in CS 2360, ok?
Other destructive assignment functions that you may see,
especially if you look at LISP code in dialects that came
before Common LISP, are "rplaca" and "rplacd". The former is
shorthand for "replace car" and is equivalent to:
(setf (car ...) ...) or (setf (first ...) ...)
The latter is shorthand for "replace cdr" and is equivalent
to:
(setf (cdr ...) ...) or (setf (rest ...) ...)
Hence, "setf" in Common LISP combines the power of "setq",
"rplaca", and "rplacd".
The cost of assignment
On a less pragmatic note, but from a more
computational/theoretical perspective, assignment carries an
additional cost. Under the substitution model of evaluation,
a variable name was bound to its value at the time the
function was entered, and it didn't change during the
execution of the function. Life was simple.
By introducing assignment, we've disrupted our simple
substitution model of evaluation. The values of variables depend
on where you are in the control flow within the procedure.
Debugging is more complicated. But more importantly, your simple
substitution model of evaluation is no longer powerful enough to
describe what's going on. You need a more powerful, and more
complex, and no longer mathematically nice, evaluation model, which
we won't talk about here.
Furthermore, by introducing assignment, you've given up your
immunity to the siren song of global variables. Once you've
crossed the line and decided to abandon the functional programming
paradigm just a little bit, it doesn't take much to get you
to abandon it just a little bit more and drop in some global
variables when you just can't figure out a more elegant way to
pass information between procedures. And then you lose your
"referential transparency"...you no longer know that the procedure
you're looking at is affected only by the information explicitly
passed to it through the argument list. That's not conducive
to a happy debugging experience.
By introducing assignment, you may be improving efficiency in
terms of cycles used, space used, and so on. You may also be
improving the "aesthetic complexity"---your program may look
less complicated (but that's not guaranteed, not by any stretch
of the imagination). But you're also demonstrably increasing
computational complexity. That makes your life harder if you
happen to be a compiler or interpreter writer, but that's no
big deal, as we don't mind making computers do work. What is
a big deal is that assignment makes it harder to test, debug,
or validate your software. A procedure in which variable
bindings don't change has to be easier to understand than one
in which the bindings do change, no? And when you're working
with really big systems where big dollars or real lives
are on the line, you want to put more value on reducing complexity
than on improving efficiency. Massive software failures don't
happen because programs are too slow or they're too big...they
happen because the programs are too complex to be checked out
thoroughly.
Does this mean you shouldn't use assignment? No. What it
means is that you should realize that you're paying a price
to do so.
Lecture notes by Kurt Eiselt, 1998.
Minor changes / additions by Brian McNamara, 1998.
Last updated on
Thu Aug 13 02:19:29 EDT 1998
by Brian McNamara