The big transition

Up to now, we've followed a very pure functional approach to 
programming:

  no assignment
  no iteration (of the traditional count and loop kind)
  no side effects
  reliance on referential transparency (i.e., the result of
    a function is dependent only on the parameters passed
    to it)

But hey, it sure would be nice to be able to have a function 
whose behavior is dependent on its history---that is, a 
function that has a sense of state.

As we've seen, things in the real world have state.  For 
example, if we want our Alquerque programs to learn so as to 
play better next time, we'd like to be able to save stuff 
that doesn't go away when the program ends.  Or if we want to 
model physical systems like pizza shops, hotels, cafeterias, 
gas stations, automated tellers, coke machines, operating
systems, or even robots, all of which exhibit varying behavior 
over time as a result of their previous history, we want to be 
able to retain something about history or state.

But purely functional programming severely hampers our 
ability to do this.  What we're going to need to make this 
whole state thing work are variables and assignment.

We've seen variables in LISP already, although we haven't 
often referred to them as such.  Those are the symbols in 
the argument list of a function, but we don't currently have 
much control over them.  Today we'll learn how to create 
them, control them, and use them.


Variables and scoping in LISP

The variables created in the argument list of a function 
definition are, as you may know, local to the function, and 
they are bound to values when the function is called.  Those 
values are accessible during the execution/evaluation of the 
body of the function, but they cannot be accessed after the 
function evaluation has ceased.  Furthermore, those values 
cannot be accessed or altered by functions which are called 
by the current function.

When the bindings of variables in the argument list are 
accessible only within the body of the function, this is 
called "lexical scoping", and the variables are said to be 
"lexically scoped".  (This is also called "static scoping".)
Here's an example:

(defun add-3 (number)
  (+ 3 number))

It's a silly example, but it works.  Compare this to:

(defun add-3 (number)
  (add-a-constant 3))

(defun add-a-constant (const)
  (+ number const))

This example doesn't work.  Why?  Because "number" is 
accessible in "add-3" but not "add-a-constant", and that's 
a result of lexical scoping.  If these variables were 
"dynamically scoped", the latter example would work just 
fine.  Under dynamic scoping, the variables in the argument 
list are bound when the function is called, and the values 
are accessible by this function and any function called by 
this function (and so on) until the evaluation of this 
function has ceased.  In other words, the variable values are 
accessible to any functions called by this function even 
though the values aren't passed to the called functions as 
parameters.
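
Under lexical scoping, the usual way to repair the broken example is to pass the value along explicitly.  Here's a minimal sketch of that fix; the two-argument version of "add-a-constant" is an assumption made for illustration, not something defined elsewhere in these notes:

```lisp
;; ADD-A-CONSTANT now receives NUMBER as an explicit parameter
;; (this two-argument signature is assumed for illustration only).
(defun add-a-constant (number const)
  (+ number const))

;; ADD-3 hands its own lexically scoped NUMBER to the helper.
(defun add-3 (number)
  (add-a-constant number 3))

(add-3 7)   ; => 10
```

Everything "add-a-constant" needs now shows up in its argument list, so you can understand it without knowing who called it.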

You can think of a dynamically scoped variable as being local 
to the function in which that variable is first bound, but 
global to all functions called by that first function (and 
all functions called by those functions, and so on...).  But
don't confuse a dynamically scoped variable with a global
variable...a dynamically scoped variable can be accessed only
while the function that bound it is still being evaluated, 
whereas a global variable can be accessed from anywhere at any 
time.

Dynamic scoping was employed in most earlier dialects of 
LISP, but lexical scoping is the default in Common LISP.  
Lexical scoping is preferred over dynamic scoping for two 
reasons:

1.  When you're thinking about when or where variable 
bindings can be accessed, it's easier to think in terms of 
"they're accessible where the variable names can be seen in 
the body of the function" instead of "they're accessible in 
this function, even if you don't see the variable names, so 
long as this function was called by another function in which 
these variables were accessible."  (Whew!)  You'd like to be 
able to understand and debug a function based on what you 
know just by looking at the function itself.  You don't want 
to have to know where you are in the overall control flow of 
a group of interdependent functions to know how this 
particular function is going to work at this point in time.  
Again, it's an issue of controlling complexity.

2.  Under lexical scoping, you can use a variable name 
locally in one function, and you (or someone else) can use 
the same name locally in another function, and neither will 
clobber the other.  Lexical scoping avoids variable name 
conflicts.  Under dynamic scoping, there is a possibility 
that this undesirable situation could arise and, depending on 
who calls what and when, one value of a variable could be 
inadvertently wiped out by another.  This can get pretty 
nasty in a communal programming environment; be sure to 
practice safe programming.  (Using packages avoids this
problem too; you learned about packages in tonight's 
lab, if not last week's lab, I hope.)

We can get dynamic scoping in Common LISP if we really want 
to.  We do this by declaring the variable or variables
in question as "special":

(defun add-3 (number)
  (declare (special number))
  (add-a-constant 3))

(defun add-a-constant (const)
  (declare (special number))
  (+ number const))

The first declaration tells Common LISP "set up the variable 
'number' as a special dynamically-scoped variable when you 
evaluate 'add-3'".  The second declaration tells Common LISP 
that "'number' is a dynamically-scoped variable---it's not 
created in the 'normal' way in this function, but go ahead and 
access its value in this function".  Any function that uses a 
dynamically-scoped variable set up this way has to also contain 
the special declaration.

So now if LISP has evaluated the two function definitions above for
"add-3" and "add-a-constant", we should get predictable behavior
when we test it out.  So we try to add 3 to the value 7 like this:

? (add-3 7)
10

and sure enough it works.  The value 7 wasn't explicitly passed
between the two functions, but making "number" a dynamically-
scoped variable did the trick.  

Now, does the value bound to "number" persist after "add-3"
and "add-a-constant" are done?  We didn't do this in class
today, but just so you know what's up, let's make another function 
and see what happens.  I have the same two functions as before,
plus a third one in which I also make "number" a dynamically-
scoped variable.

(defun add-3 (number)
  (declare (special number))
  (add-a-constant 3))

(defun add-a-constant (const)
  (declare (special number))
  (+ number const))

(defun just-add-4 ()
  (declare (special number))
  (+ number 4))

So I try to add 3 to 6, expecting to see 9 returned:

? (add-3 6)
9

That works, just like before.  But now if I count on "number" 
still being bound to 6 and try to add 4 to it, expecting to see 10
returned, it blows up:

? (just-add-4)
> Error: Unbound variable: NUMBER .
> While executing: JUST-ADD-4

So the value of "number" does not persist.
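
If you'd rather check this without triggering an error, here's a small sketch.  The function names "number-visible-p" and "add-3-and-check" are made up for this illustration, and it relies on the standard function "boundp", which these notes haven't covered; it reports whether a symbol currently has a dynamic binding.  The results assume a fresh LISP session in which "number" has no global value:

```lisp
;; Hypothetical helper: is the symbol NUMBER dynamically bound right now?
(defun number-visible-p ()
  (boundp 'number))

;; Hypothetical caller: binds NUMBER dynamically, then asks the helper.
(defun add-3-and-check (number)
  (declare (special number))
  (number-visible-p))

(add-3-and-check 6)   ; => true, the binding exists during the call
(number-visible-p)    ; => NIL, the binding is gone once the call returns
```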

In any case, adding dynamic scoping to your LISP code may be something
you'll want to do in the distant future, but you don't really need
it now unless you really goober up your Alquerque program and you
think that dynamic scoping will make things better...yeah, right.


Introducing more local variables

Sometimes you'd like to have more local variables in your 
functions than just what's available through the argument 
list.  One way to introduce more variables is to use a 
"helping function" and add additional variables in the 
argument list to the helping function.  But you already knew 
that.  

Another way to introduce additional local variables is to use 
LISP's "let" special form:

(let ((variable1 value1)
      (variable2 value2)
           :
           :
      (variablen valuen))
  body)

The "let" form allows you to bind a bunch of values to 
variables, and makes all those variables local to the code 
embedded within the "let" form.  We'll see an example of how 
to use it later on.
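
In the meantime, here's a minimal sketch of "let" in action.  The function "rectangle-summary" and its variable names are invented for this illustration:

```lisp
;; Hypothetical example: AREA and PERIMETER exist only inside the LET.
(defun rectangle-summary (width height)
  (let ((area (* width height))
        (perimeter (* 2 (+ width height))))
    (list area perimeter)))

(rectangle-summary 3 4)   ; => (12 14)
```

Once the "let" form has been evaluated, "area" and "perimeter" are no longer accessible, just like the variables in an argument list.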


Assignment

We talked a little bit about assignment early on in this course.
Let's do a little refresher, and then let's get dangerous.

The fundamental method of assignment in LISP is the "set" function:

(set X 3)

But this doesn't do what you probably expect it to do, 
because like most good LISP functions, it evaluates its 
arguments.  If X is bound to another symbol, like "A", the 
net result would be the equivalent of A := 3.  If X is not 
bound, however, LISP will give you an error message.

What you usually want to do is this:

(set (quote X) 3)

which is the equivalent of X := 3.
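
Here's a small sketch of the difference.  The variable name *name* is made up for this illustration, and the sketch leans on "defvar" (which comes up later in these notes) and the standard function "symbol-value" to peek at the results:

```lisp
(defvar *name* 'a)        ; *NAME* holds the symbol A

(set *name* 3)            ; *NAME* is evaluated to A, so A gets the value 3
(symbol-value 'a)         ; => 3
*name*                    ; => A   (unchanged)

(set (quote *name*) 3)    ; the quote blocks evaluation, so *NAME* itself gets 3
*name*                    ; => 3
```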

This is needed so often that there's a shorthand version of 
it, called "setq", which is short for "set quote":

? (setq X 3)  ;; setq binds a value to a symbol
3
? X
3

There's another function that does the same thing, plus a bit 
more.  It's called "setf":

? (setf X 3)
3
? X
3

SETF and the generalized variable

We tend to think of a variable as a place where you store a 
data object.  In reality, a variable in LISP is a place where 
you store a pointer to a data object.  A "generalized 
variable" in Common LISP is any expression that allows us to 
access a particular place in memory.  So, for example, when 
LISP evaluates:

(setf X '(A B C D))

the resulting construct in memory looks like this:

X  
|
|
|    _______       _______       _______       _______
+-->|   |   |     |   |   |     |   |   |     |   |  /|     
    | | | --+---->| | | --+---->| | | --+---->| | | / |     
    |_|_|___|     |_|_|___|     |_|_|___|     |_|_|/__|     
      |             |             |             |
      A             B             C             D

Assignment in LISP is nothing more than replacing one pointer 
with another (or initializing a pointer).  So X designates a 
pointer, and (nth 2 X) designates a different pointer:

X                          (nth 2 X)
|                            |
|                            |
|    _______       _______   |   _______       _______
+-->|   |   |     |   |   |  |  |   |   |     |   |  /|     
    | | | --+---->| | | --+---->| | | --+---->| | | / |     
    |_|_|___|     |_|_|___|  |  |_|_|___|     |_|_|/__|     
      |             |        |    |             |
      A             B        +--> C             D

Both X and (nth 2 X) are place descriptions; they're both 
generalized variables.  Consequently, if we have LISP 
evaluate the expression:

? (setf (nth 2 X) 'Q)

the list above will be changed to:

X                          (nth 2 X)
|                            |
|                            |
|    _______       _______   |   _______       _______
+-->|   |   |     |   |   |  |  |   |   |     |   |  /|     
    | | | --+---->| | | --+---->| | | --+---->| | | / |     
    |_|_|___|     |_|_|___|  |  |_|_|___|     |_|_|/__|     
      |             |        |    |             |
      A             B        +--> Q             D

and if we type the following at our LISP evaluator:

? X

we'll see LISP return:

(A B Q D)

If we tried this with "setq", it wouldn't work, because "setq" 
accepts only a symbol as the thing to assign to, and that's the 
difference between "setq" and "setf".  Given a place such as 
(nth 2 X), "setf" goes into the actual place in memory that the 
pointer is pointing to and replaces the value there.  This is 
what's known as "destructive modification", and it's not 
usually what you want to do.  Why?  It's dangerous, 
especially in a communal programming environment.
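
If you want to try this yourself, here's a minimal sketch.  The variable name *letters* is invented for this illustration.  Note one deliberate change: the list is built with "list" rather than written as a quoted constant, because the Common LISP standard leaves the effect of destructively modifying a quoted constant undefined:

```lisp
(defvar *letters* (list 'a 'b 'c 'd))   ; LIST builds a fresh, modifiable list

(setf (nth 2 *letters*) 'q)             ; => Q
*letters*                               ; => (A B Q D)

;; (setq (nth 2 *letters*) 'q) would be rejected as an error,
;; since SETQ only accepts a symbol as the variable to assign.
```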

Say for example that you have a system in which information 
is shared between functions via global variables.  There are 
times when this is the right thing to do.  To indicate 
globals in LISP, we typically use a function called "defvar", 
and we put asterisks around the variable name just to make it 
stand out:

? (defvar *X* '(A B C D))
*X*
? *X*
(A B C D)

But then you might set up another global variable like this:

? (defvar *Y* (rest (rest *X*)))
*Y*
? *Y*
(C D)

What happens if some other unsuspecting soul comes by and, 
thinking that *X* and *Y* are independent, decides to change 
*Y*?

? (setf (first *Y*) 'Q)
Q
? *Y*
(Q D)

This looks like what we wanted to happen, but now when we go 
back and look at *X*, we find that it's changed, and we may not 
have wanted that:

? *X*
(A B Q D)

In situations like this, it would be safer to make a copy of 
the data object before messing with it.  LISP has functions 
called "copy-list" and "copy-tree" for doing exactly this, so 
make sure you look them up in your book.
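
As a rough sketch of the safer approach, here's the same scenario with a copy made first.  The variable names *source* and *tail-copy* are invented for this illustration:

```lisp
(defvar *source* (list 'a 'b 'c 'd))

;; COPY-LIST builds new list cells, so *TAIL-COPY* shares none with *SOURCE*.
(defvar *tail-copy* (copy-list (rest (rest *source*))))

(setf (first *tail-copy*) 'q)
*tail-copy*   ; => (Q D)
*source*      ; => (A B C D), untouched this time
```

Keep in mind that "copy-list" copies only the top level of a list; if the elements are themselves lists that might get modified, "copy-tree" copies the nested structure as well.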

The obvious next question is, if destructive modification is 
inherently unsafe, especially in communal programming 
environments, why do it at all?  The answer is that copying 
data structures, especially if they're big, eats up space and 
time.  While one of the really nice things about a dynamic 
programming language like LISP is that you typically don't 
worry about memory management, the fact is that the more 
copying of data structures that you do, the more memory gets 
used up.  That'll get in your way sometimes.  Furthermore, as 
that same memory becomes unused, the space will have to be 
recovered by a "garbage collector" and returned to "available 
memory".  When the garbage collector is running, that also 
eats up time, and that's not desirable either.

So, people will use destructive modifications for the sake of 
efficiency, and when those situations occur where saving 
computational resources is critical, you should think about 
using destructive modifications too.  But of course, you 
won't encounter these situations in CS 2360, ok?

Other destructive assignment functions that you may see, 
especially if you look at LISP code in dialects that came 
before Common LISP, are "rplaca" and "rplacd".  The former is 
shorthand for "replace car" and is equivalent to:

(setf (car ...) ...) or (setf (first ...) ...)

The latter is shorthand for "replace cdr" and is equivalent 
to:

(setf (cdr ...) ...) or (setf (rest ...) ...)
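
Here's a brief sketch of both in use, with a made-up variable name *cells*.  One difference worth knowing if you run across these in old code: "rplaca" and "rplacd" return the modified list cell, while the "setf" versions return the new value.

```lisp
(defvar *cells* (list 'a 'b 'c))

(rplaca *cells* 'x)             ; same effect as (setf (car *cells*) 'x)
*cells*                         ; => (X B C)

(rplacd *cells* (list 'y 'z))   ; same effect as (setf (cdr *cells*) (list 'y 'z))
*cells*                         ; => (X Y Z)
```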

Hence, "setf" in Common LISP combines the power of "setq", 
"rplaca", and "rplacd".


The cost of assignment

On a less pragmatic, more computational/theoretical note, 
assignment carries an additional cost.  Under the substitution 
model of evaluation, 
a variable name was bound to its value at the time the 
function was entered, and it didn't change during the 
execution of the function.  Life was simple.

By introducing assignment, we've disrupted our simple 
substitution model of evaluation.  The values of variables depend 
on where you are in the control flow within the procedure.  
Debugging is more complicated.  But more importantly, your simple 
substitution model of evaluation is no longer powerful enough to 
describe what's going on.  You need a more powerful, and more 
complex, and no longer mathematically nice, evaluation model, which 
we won't talk about here.
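
To see the substitution model break down, consider this tiny sketch.  The function "before-and-after" is invented for this illustration:

```lisp
;; Hypothetical example: the meaning of X depends on where you are in the body.
(defun before-and-after (x)
  (let ((before x))          ; remember X as it was on entry
    (setq x (+ x 1))         ; assignment changes what X means from here on
    (list before x)))

(before-and-after 5)   ; => (5 6)
```

You can't just substitute 5 for every "x" in the body and get the right answer anymore; the second use of "x" stands for 6.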

Furthermore, by introducing assignment, you've given up your 
immunity to the siren song of global variables.  Once you've
crossed the line and decided to abandon the functional programming
paradigm just a little bit, it doesn't take much to get you
to abandon it just a little bit more and drop in some global
variables when you just can't figure out a more elegant way to
pass information between procedures.  And then you lose your
"referential transparency"...you no longer know that the procedure
you're looking at is affected only by the information explicitly
passed to it through the argument list.  That's not conducive
to a happy debugging experience.
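
Here's a minimal sketch of what that loss looks like.  The names *running-total* and "add-to-total" are made up for this illustration:

```lisp
(defvar *running-total* 0)

;; Hypothetical example: the result depends on history, not just on AMOUNT.
(defun add-to-total (amount)
  (setq *running-total* (+ *running-total* amount)))

(add-to-total 5)   ; => 5
(add-to-total 5)   ; => 10, same argument but a different result
```

This is exactly the "sense of state" we said we wanted, and it's also exactly why you can no longer predict what the call returns just by looking at its argument list.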

By introducing assignment, you may be improving efficiency in 
terms of cycles used, space used, and so on.  You may also be 
improving the "aesthetic complexity"---your program may look 
less complicated (but that's not guaranteed, not by any stretch
of the imagination).  But you're also demonstrably increasing 
the complexity of the evaluation model.  That makes your life harder if you 
happen to be a compiler or interpreter writer, but that's no 
big deal, as we don't mind making computers do work.  What is 
a big deal is that assignment makes it harder to test, debug, 
or validate your software.  A procedure in which variable 
bindings don't change has to be easier to understand than one 
in which the bindings do change, no?  And when you're working 
with really big systems where big dollars or real lives
are on the line, you want to put more value on reducing complexity 
than on improving efficiency.  Massive software failures don't 
happen because programs are too slow or they're too big...they 
happen because the programs are too complex to be checked out 
thoroughly.

Does this mean you shouldn't use assignment?  No.  What it 
means is that you should realize that you're paying a price 
to do so.


Lecture notes by Kurt Eiselt, 1998.
Minor changes / additions by Brian McNamara, 1998.
Last updated on Thu Aug 13 02:19:29 EDT 1998 by Brian McNamara