CS 2360 - May 7, 1996

Lecture 13 -- Using Side Effects


The big transition

Up to now, we've followed a very pure functional approach to 
programming:

  no assignment
  no iteration (of the traditional count and loop kind)
  no side effects
  reliance on referential transparency (i.e., the result of
    a function is dependent only on the parameters passed
    to it)

But hey, it sure would be nice to be able to have a function 
whose behavior is dependent on its history---that is, a 
function that has a sense of state.

As we've seen, things in the real world have state.  For 
example, if we wanted our hexapawn programs to learn so as to 
play better next time, we'd like to be able to save stuff 
that doesn't go away when the program ends.  Or if we want to 
model physical systems like pizza shops, hotels, cafeterias, 
gas stations, automated tellers, coke machines, operating
systems, or even robots, all of which exhibit varying behavior 
over time as a result of their previous history, we want to be 
able to retain something about history or state.

But purely functional programming severely hampers our 
ability to do this.  What we're going to need to make this 
whole state thing work are variables and assignment.

We've seen variables in LISP already, although we haven't 
really referred to them as such.  Those are the symbols in 
the argument list of a function, but we don't currently have 
much control over them.  Today we'll learn how to create 
them, control them, and use them.

Variables and scoping in LISP

The variables created in the argument list of a function 
definition are, as you know, local to the function, and they 
are bound to values when the function is called.  Those 
values are accessible during the execution/evaluation of the 
body of the function, but they cannot be accessed after the 
function evaluation has ceased.  Furthermore, those values 
cannot be accessed or altered by functions which are called 
by the current function.

When the bindings of variables in the argument list are 
accessible only within the body of the function, this is 
called "lexical scoping", and the variables are said to be 
"lexically scoped".  (This is also called "static scoping".)
Here's an example:

(defun my-first (my-list)
  (car my-list))

It's a silly example, but it works.  Compare this to:

(defun my-first (my-list)
  (my-car))

(defun my-car ()
  (car my-list))

This example doesn't work.  Why?  Because "my-list" is 
accessible in "my-first" but not "my-car", and that's a 
result of lexical scoping.  If these variables were 
"dynamically scoped", the second example would work just 
fine.  Under dynamic scoping, the variables in the argument 
list are bound when the function is called, and the values 
are accessible by this function and any function called by 
this function (and so on) until the evaluation of this 
function has ceased.  In other words, the variable values are 
accessible to any functions called by this function even 
though the values aren't passed to the called functions as 
parameters.

You can think of a dynamically scoped variable as being local 
to the function in which that variable is first bound, but 
global to all functions called by that first function (and 
all functions called by those functions, and so on...).

Dynamic scoping was employed in most earlier dialects of 
LISP, but lexical scoping is the default in Common LISP.  
Lexical scoping is preferred over dynamic scoping for two 
reasons:

1.  When you're thinking about when or where variable 
bindings can be accessed, it's easier to think in terms of 
"they're accessible where the variable names can be seen in 
the body of the function" instead of "they're accessible in 
this function, even if you don't see the variable names, so 
long as this function was called by another function in which 
these variables were accessible."  (Whew!)  You'd like to be 
able to understand and debug a function based on what you 
know just by looking at the function itself.  You don't want 
to have to know where you are in the overall control flow of 
a group of interdependent functions to know how this 
particular function is going to work at this point in time.  
Again, it's an issue of controlling complexity.

2.  Under lexical scoping, you can use a variable name 
locally in one function, and you (or someone else) can use 
the same name locally in another function, and neither will 
clobber the other.  Lexical scoping avoids variable name 
conflicts.  Under dynamic scoping, there is a possibility 
that this undesirable situation could arise and, depending on 
who calls what and when, one value of a variable could be 
inadvertently wiped out by another.  This can get pretty 
nasty in a communal programming environment; be sure to 
practice safe programming.

We can get dynamic scoping in Common LISP if we really want 
to.  We do this by declaring the variables as "special":

(defun my-first (list)
  (declare (special list))
  (my-car ()))

(defun my-car ()
  (declare (special list))
  (car list))

The first declaration tells Common LISP "set up this variable 
as a special dynamically-scoped variable when you execute my-
first".  The second declaration tells Common LISP that 
"'list' is a dynamically-scoped variable---it's not created 
in the 'normal' way in this function, but go ahead and access 
its value in this function".  Any function that uses a 
dynamically-scoped variable has to also contain the special 
declaration.

Assignment

Now that we have variables, let's see how to do stuff to them 
using assignment.  The fundamental method of assignment in 
LISP is the "set" function:

(set X 3)

But this doesn't do what you probably expect it to do, 
because like most good LISP functions, it evaluates its 
arguments.  If X is bound to another symbol, like "A", the 
net result would be the equivalent of A {- 3.  If X is not 
bound, however, LISP will give you an error message.

What you usually want to do is this:

(set (quote X) 3)

which is the equivalent of X {- 3.

This is needed so often that there's a shorthand version of 
it, called "setq", which is short for "set quote":

? (setq X 3)  ;; setq binds a value to a symbol
3
? X
3

There's another function that does the same thing, plus a bit 
more.  It's called "setf":

? (setf X 3)
3
? X
3

SETF and the generalized variable

We tend to think of a variable as a place where you store a 
data object.  In reality, a variable in LISP is a place where 
you store a pointer to a data object.  A "generalized 
variable" in Common LISP is any expression that allows us to 
access a particular place in memory.  So, for example, when 
LISP evaluates:

(setf X '(A B C D))

the resulting construct in memory looks like this:

X  
|
|
|    _______       _______       _______       _______
+-->|   |   |     |   |   |     |   |   |     |   |  /|     
    | | | --+---->| | | --+---->| | | --+---->| | | / |     
    |_|_|___|     |_|_|___|     |_|_|___|     |_|_|/__|     
      |             |             |             |
      A             B             C             D

Assignment in LISP is nothing more than replacing one pointer 
with another (or initializing a pointer).  So X designates a 
pointer, and (nth 2 X) designates a different pointer:

X                          (nth 2 X)
|                            |
|                            |
|    _______       _______   |   _______       _______
+-->|   |   |     |   |   |  +->|   |   |     |   |  /|     
    | | | --+---->| | | --+---->| | | --+---->| | | / |     
    |_|_|___|     |_|_|___|     |_|_|___|     |_|_|/__|     
      |             |             |             |
      A             B             C             D

Both X and (nth 2 X) are place descriptions; they're both 
generalized variables.  Consequently, if we have LISP 
evaluate the expression:

? (setf (nth 2 X) 'Q)

the list above will be changed to:

X                          (nth 2 X)
|                            |
|                            |
|    _______       _______   |   _______       _______
+-->|   |   |     |   |   |  +->|   |   |     |   |  /|     
    | | | --+---->| | | --+---->| | | --+---->| | | / |     
    |_|_|___|     |_|_|___|     |_|_|___|     |_|_|/__|     
      |             |             |             |
      A             B             Q             D

and if we type the following at our LISP evaluator:

? X

we'll see LISP return:

(A B Q D)

If we tried this with the "setq" function, it wouldn't work, 
and that's the difference between "setq" and "setf".  The 
"setf" function goes into the actual place in memory that the 
pointer is pointing to and replaces the value there.  This is 
what's known as "destructive modification", and it's not 
usually what you want to do.  Why?  It's dangerous, 
especially in a communal programming environment.

Say for example that you have a system in which information 
is shared between functions via global variables.  There are 
times when this is the right thing to do.  To indicate 
globals in LISP, we typically use a function called "defvar", 
and we put asterisks around the variable name just to make it 
stand out:

? (defvar *X* '(A B C D))
*X*
?*X*
(A B C D)

But then you might set up another global variable like this:

? (defvar *Y* (rest (rest *X*)))
*Y*
? *Y*

What happens if some other unsuspecting soul comes by and, 
thinking that *X* and *Y* are independent, decides to change 
*Y*?

? (setf (first *Y*) 'Q)
Q
? *Y*
(Q D)

This looks like what we wanted to happen, but now when we go 
back and look at X, we find that it's changed, and we may not 
have wanted that:

? *X*
(A B Q D)

In situtations like this, it would be safer to make a copy of 
the data object before messing with it.  LISP has functions 
called "copy-list" and "copy-tree" for doing exactly this, so 
make sure you look them up in your book.

The obvious next question is, if destructive modification is 
inherently unsafe, especially in communal programming 
environments, why do it at all?  The answer is that copying 
data structures, especially if they're big, eats up space and 
time.  While one of the really nice things about a dynamic 
programming language like LISP is that you typically don't 
worry about memory management, the fact is that the more 
copying of data structures that you do, the more memory gets 
used up.  That'll get in your way sometimes.  Furthermore, as 
that same memory becomes unused, the space will have to be 
recovered by a "garbage collector" and returned to "available 
memory".  When the garbage collector is running, that also 
eats up time, and that's not desirable either.

So, people will use destructive modifications for the sake of 
efficiency, and when those situations occur where saving 
computational resources is critical, you should think about 
using destructive modifications too.  But of course, you 
won't encounter these situations in CS 2360, ok?

Other destructive assignment functions that you may see, 
especially if you look at LISP code in dialects that came 
before Common LISP, are "rplaca" and "rplacd".  The former is 
shorthand for "replace car" and is equivalent to:

(setf (car ...) ...) or (setf (first ...) ...)

The latter is shorthand for "replace cdr" and is equivalent 
to:

(setf (cdr ...) ...) or (setf (rest ...) ...)

Hence, "setf" in Common LISP combines the power of "setq", 
"rplaca", and "rplacd".

The cost of assignment

On a less pragmatic note, but from a more 
computational/theoretical perspective, assignment carries an 
additional cost.  Under the substitution model of evaluation, 
a variable name was bound to its value at the time the 
function was entered, and it didn't change during the 
execution of the function.  Life was simple.

By introducing assignment, we've disrupted our simple model 
of evaluation.  The values of variables depend on where you 
are in the control flow within the function.  Debugging is 
more complicated.  But more importantly, your simple model of 
evaluation is no longer powerful enough to describe what's 
going on.  You need a more powerful, and more complex, and no 
longer mathematically nice, evaluation model, which we won't 
talk about here.

By introducing assignment, you may be improving efficiency in 
terms of cycles used, space used, and so on.  You may also be 
improving the "aesthetic complexity"---your program may look 
less complicated (but that's not guaranteed, not by any stretch
of the imagination).  But you're also demonstrably increasing 
computational complexity.  That makes your life harder if you 
happen to be a compiler or interpreter writer, but that's no 
big deal, as we don't mind making computers do work.  What is 
a big deal is that assignment makes it harder to test, debug, 
or validate your software.  A procedure in which variable 
bindings don't change has to be easier to understand than one 
in which the bindings do change, no?  And when you're working 
with really big systems where big dollars, and more 
frequently, lives, are on the line, you want to put more 
value on reducing complexity than on improving efficiency.  
Massive software failures don't happen because programs are 
too slow or they're too big...they happen because the programs 
are too complex to be checked out thoroughly.

Does this mean you shouldn't use assignment?  No.  What it 
means is that you should realize that you're paying a price 
to do so.

Introducing more local variables

Sometimes you'd like to have more local variables in your 
functions than just what's available through the argument 
list.  One way to introduce more variables is to use a 
"helping function" and add additional variables in the 
argument list to the helping function.  But you already knew 
that.  

Another way to introduce additional local variables is to use 
LISP's "let" function:

(let ((variable1 value1)
      (variable2 value2)
           :
           :
      (variablen valuen))

      

                         )

The "let" form allows you to bind a bunch of values to 
variables, and makes all those variables local to the code 
embedded within the "let" form.  We'll see an example of how 
to use it in the next set of notes.



Copyright 1996 by Kurt Eiselt.  All rights reserved.

Last revised: May 7, 1996