CS 2360 - November 24, 1998

Lecture 18 -- Building Languages on Languages


Today we started a discussion about programming languages and
why some have long lifespans while many others have very short
lifespans.  But first we talked about something called...


The lexical closure

Let's take another look at the notion of process and state.  
Let's say that I want to construct a function that simulates 
a Coke (TM) machine.  To do this, I'll need a state variable 
to tell me how many cans of Coke there are in this machine.  
I could do this:

(defvar number-of-cans 5)

and then my function could be this:

(defun get-coke ()
  (cond ((> number-of-cans 0)
         (setf number-of-cans (- number-of-cans 1))
         (print "have a Coke"))
        (T (print "sorry, out of Coke"))))

I can now call "get-coke" five times, and I'll see the same 
message ("have a Coke"), but on the sixth call, I'll see a 
new message ("sorry, out of Coke").  I now have a function 
which knows its history -- it has state.

However, "number-of-cans" is a free variable -- that is, it's
bound outside the lexical scope of the function in which it's 
referenced, and I don't like that.  In this case, it's a 
global variable that any function can modify, and that's a 
situation that's ripe for trouble.  (Because "defvar" makes 
the name special, even a "let" elsewhere that reuses the name 
rebinds this same dynamic variable instead of creating a 
separate lexical one.)  I'd really like this variable to be 
local to just 
one specific Coke machine.  And I'm sure I don't want to have 
to create a separate global variable for each simulated Coke 
machine, especially if I'm going to simulate the behavior of 
some vending machine company that may own hundreds or 
thousands of individual Coke machines.  Keeping track of all 
that could be difficult, to say the least.
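
Here is a minimal sketch of the kind of trouble I mean.  The 
names "*shared-cans*", "get-coke-global", and 
"restock-other-machine" are made up for this illustration 
(they are not the definitions used elsewhere in these notes), 
and the function returns its message instead of printing it 
so the result is easy to check:

```lisp
;; Sketch: any function anywhere can change a global state variable.
;; *SHARED-CANS*, GET-COKE-GLOBAL, and RESTOCK-OTHER-MACHINE are
;; hypothetical names used only for this illustration.
(defvar *shared-cans* 5)

(defun get-coke-global ()
  (cond ((> *shared-cans* 0)
         (setf *shared-cans* (- *shared-cans* 1))
         "have a Coke")
        (T "sorry, out of Coke")))

(defun restock-other-machine ()
  ;; Written with some other machine in mind, but it changes ours too.
  (setf *shared-cans* 0))
```

After one call to "restock-other-machine", the next call to 
"get-coke-global" reports an empty machine even though only 
one Coke (or none) was ever taken from it.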

As you know, to create additional local variables beyond 
those that are given in a function's argument list, I can use 
LISP's "let" form.  So my revised "get-coke" might look like 
this:

(defun get-coke ()
  (let ((number-of-cans 5))
    (cond ((> number-of-cans 0)
           (setf number-of-cans (- number-of-cans 1))
           (print "have a Coke"))
          (T (print "sorry, out of Coke")))))


Now, "number-of-cans" is lexically-scoped within the confines 
of the "let" form, so no other function can clobber it.  But 
it doesn't work the way I want it to, because each time I 
call "get-coke", my local state variable "number-of-cans" is 
reset to the value 5.  Waaahhh!!!

Rethinking this, I decided that what I'd really really like 
to do is associate a variable with my Coke machine simulation 
such that:

1.  It's local to the function, so nobody else can mess 
    with it.

2.  The state is saved in this variable, even after the
    function has been exited, and the next time I call this
    function, the state variable contains exactly what was
    left in it when this function was last exited.

How can I do this?  We need to take advantage of lexical 
scoping:  we can combine the lexically-scoped variable of the 
"let" form with the "lambda" form that we learned about 
earlier, and I'll throw in the "function" special form.  
I'll use pretty much the same code as above, but this new 
function, which I'll call "make-coke-machine" instead of 
"get-coke", doesn't have the same purpose as "get-coke".  
(I'll show you what "get-coke" looks like soon.)
When I called "get-coke", I expected a simulation of 
a Coke machine -- it would either give me a virtual Coke or 
it wouldn't.  But when I embed that same code inside a 
"lambda" form, which is in turn embedded inside a "function" 
form, I get a function which when called returns *another* 
function, which will then behave like a Coke machine when it 
is evaluated:

(defun make-coke-machine ()
  (let ((number-of-cans 5))
    (function (lambda ()
                (cond ((> number-of-cans 0)
                       (setf number-of-cans 
                             (- number-of-cans 1))
                       (print "have a Coke"))
                      (T (print "sorry, out of Coke")))))))

Now, calling "make-coke-machine" returns an executable 
function object with a built-in lexically-scoped variable, 
"number-of-cans", initially bound to the value 5.  For all 
intents and purposes, it is a private storage place known 
only to this particular instantiation of the function.

Why does it work?  When I write a lambda expression inside 
"function", the LISP system returns a function object that 
holds on to the bindings of any free variables in the lambda 
expression that were in effect when the "function" form was 
evaluated -- lexical scoping demands it.  "number-of-cans" is 
a free variable from the point of view of the lambda 
expression (again, it's not defined in the lambda's argument 
list or in a "let" form inside the lambda), so LISP keeps the 
binding created by the surrounding "let" alive inside the 
function object.  It's the binding itself that is kept, not 
a copy of the value 5, which is why the effect of the "setf" 
is still there on the next call.
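
One way to convince yourself of what is being saved is to 
create two function objects inside the same "let".  The sketch 
below is only an illustration:  "make-machine-with-restock" 
and "cans" are hypothetical names, the machine starts with a 
single can to keep the example short, and the messages are 
returned rather than printed.  If both function objects see 
the same variable, then restocking through one should be 
visible through the other:

```lisp
;; Sketch: two closures created inside the same LET share one binding.
;; MAKE-MACHINE-WITH-RESTOCK and CANS are hypothetical names.
(defun make-machine-with-restock ()
  (let ((cans 1))
    (list
     ;; First closure: hand out a can if any are left.
     (function (lambda ()
                 (cond ((> cans 0)
                        (setf cans (- cans 1))
                        "have a Coke")
                       (T "sorry, out of Coke"))))
     ;; Second closure: add N cans to the very same variable.
     (function (lambda (n)
                 (setf cans (+ cans n)))))))

;; Usage: vend until empty, restock, then vend again.
(defun try-machine-with-restock ()
  (let* ((pair (make-machine-with-restock))
         (vend (first pair))
         (restock (second pair)))
    (list (funcall vend)           ; "have a Coke"
          (funcall vend)           ; "sorry, out of Coke"
          (funcall restock 2)      ; 2
          (funcall vend))))        ; "have a Coke"
```

The last call succeeds only because both function objects 
refer to one and the same "cans".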

After evaluating "make-coke-machine", I can do this:


? (setf cs-machine (make-coke-machine))
#[COMPILED-LEXICAL-CLOSURE #x5B840E]


and I'll have a simulated Coke machine with its own name, 
"cs-machine", and its own internal state variable.  Now I'll 
create a function which will allow me to use my simulated 
Coke machine:


? (defun get-coke (machine-name)
    (funcall machine-name))
GET-COKE


and that in turn will allow me to get a simulated Coke from 
my simulated Coke machine in the simulated computer science 
building:


? (get-coke cs-machine)

"have a Coke" 
"have a Coke"


Why do we see "have a Coke" twice?  I'll leave that as an 
exercise for you.  

There should be four more Cokes left in my machine.  

? (get-coke cs-machine)

"have a Coke" 
"have a Coke"

? (get-coke cs-machine)

"have a Coke" 
"have a Coke"

? (get-coke cs-machine)

"have a Coke" 
"have a Coke"

? (get-coke cs-machine)

"have a Coke" 
"have a Coke"

? (get-coke cs-machine)

"sorry, out of Coke" 
"sorry, out of Coke"


Sure enough, I could get five Cokes, but on the sixth try, 
the machine told me that it was empty.  At any time, I could 
have created another simulated Coke machine, which would have 
its own independent local state variable:


? (setf ee-machine (make-coke-machine))
#[COMPILED-LEXICAL-CLOSURE #x5BC4F6]

? (get-coke ee-machine)

"have a Coke" 
"have a Coke"


I can show that the state variable in the "ee-machine" 
simulator is independent of the state variable in the 
"cs-machine" simulator, because even though there are Cokes 
in the "ee-machine", the "cs-machine" is still empty:


? (get-coke cs-machine)

"sorry, out of Coke" 
"sorry, out of Coke"


And I can show that the state variable is untouchable from 
outside just by doing this (in a LISP session where the 
"defvar" from the first version was never evaluated):


? number-of-cans
> Error: Unbound variable: NUMBER-OF-CANS
> While executing: SYMBOL-VALUE
> Type Command-/ to continue, Command-. to abort.
> If continued: Retry getting the value of NUMBER-OF-CANS.
See the Restarts… menu item for further choices.
1 > 


These functions that I have created with their own internal 
local state variables are called "lexical closures".  
Sometimes these things are called "generators".  They can 
have multiple local variables, and you can also pass values 
to them via parameters at the time they are created, and 
those values will be internalized too.
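
As a minimal sketch of both points, here is a machine whose 
starting stock is passed in when it is created and which keeps 
two internal variables instead of one.  The names 
"make-stocked-machine", "initial-cans", "cans", and "sold" are 
hypothetical, and the function object returns a list (the 
message plus the number sold so far) instead of printing:

```lisp
;; Sketch: a parameterized machine with two internal state variables.
;; MAKE-STOCKED-MACHINE, INITIAL-CANS, CANS, and SOLD are hypothetical names.
(defun make-stocked-machine (initial-cans)
  (let ((cans initial-cans)   ; how many cans are left
        (sold 0))             ; how many cans this machine has sold
    (function (lambda ()
                (cond ((> cans 0)
                       (setf cans (- cans 1))
                       (setf sold (+ sold 1))
                       (list "have a Coke" sold))
                      (T (list "sorry, out of Coke" sold)))))))
```

Calling (make-stocked-machine 2) gives a machine that hands 
out two Cokes and then reports that it is empty, while a 
second machine made with (make-stocked-machine 100) keeps its 
own separate counts.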

A lexical closure is a great tool for simulating independent 
real-world objects where supply and demand is an important 
issue.  Examples are robots, operating systems, grocery store 
checkout lines, automated teller machines, pumps at the gas 
station, and yes, vending machines.


Static vs. Dynamic Languages

Certainly one of the things we've stressed in this class so 
far is the idea that software development involves making 
informed choices, and that those choices carry with them both 
costs and benefits.  For example, we've discussed more than 
once the trade-offs involved with functional versus non-
functional programming styles, even within the confines of a 
single programming language.  A choice between one type of 
programming language and another will also have impacts on 
how you approach a given problem, what form your solution 
will take, and what you'll be allowed to do in implementing 
that solution.  The more "mainstream" languages, which can be 
classified as "static languages", are based on computing 
technologies from the 1950s and 1960s, and they tend to force 
programmers into the following paradigm:

	They tend to be batch-compiled languages, meaning you 
	wait for the entire program to compile and link after 
	every change.

	Development environments are usually based on ASCII 
	source files and command-line interfaces (although there 
	are some notable exceptions recently).

	Compiled programs have little or no runtime type 
	checking, which usually results in a complete crash when 
	an error occurs.

	Memory is managed manually--code has direct memory 
	references or pointers.  Such code is hard to write and 
	understand, and it can create subtle, hard-to-fix bugs.  
	Manual memory management can be inefficient, 
	error-prone, not portable, and hard to integrate with 
	other code libraries.  This is why memory-related errors 
	account for a large share of the bugs in a typical 
	static language program.

C, C++, and Pascal are examples of static languages.

Another class of languages, called dynamic languages, has grown
in popularity in recent years.  These languages enable a different,
more interactive approach to programming:

	They allow faster development and rapid prototyping.

	They usually have dynamic type information--type errors 
	are usually detected at run time, at the point where 
	they occur, and the errors are usually recoverable and 
	do not cause crashes.

	They usually have automatic memory management, which 
	hides the details of memory management from the 
	programmer.  Code is simpler, cleaner, more reliable, 
	and more reusable.

	They usually have a shell or "listener" so the user can 
	execute code interactively (either via an interpreter or 
	an incremental compiler).
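
The point above about recoverable errors can be sketched in 
LISP itself.  This is only an illustration under some 
assumptions:  "safe-add" is a hypothetical helper, and I catch 
the general "error" condition (adding a non-number typically 
signals the more specific "type-error") so the sketch doesn't 
depend on one particular LISP system:

```lisp
;; Sketch: a run-time type error that is caught and recovered from.
;; SAFE-ADD is a hypothetical helper, not part of the lecture code.
(defun safe-add (a b)
  (handler-case (+ a b)
    ;; Adding a non-number signals an error at run time (typically a
    ;; TYPE-ERROR).  Instead of crashing, return a note about what failed.
    (error () (list 'could-not-add a b))))
```

Here (safe-add 1 2) returns 3, while (safe-add 1 "two") 
returns the list (COULD-NOT-ADD 1 "two") and the program 
keeps running.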

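What languages are dynamic, at least in some of these respects? 
Dylan, Forth, Haskell, Logo, Java, Prolog, and Smalltalk (Haskell 
and Java are statically typed, but both manage memory 
automatically).  And of course LISP and its dialects, including 
Scheme.  A big benefit of dynamic languages is that they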
tend to be much more extensible than static languages.  That is,
when the needs of the programmer change, the language can be
extended to adapt to those changes.  Static languages are less
flexible in that regard.  This helps to explain why LISP has 
not disappeared after all these years.
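
As a small taste of what extending the language can mean, 
here is a sketch that adds a "while" loop, which LISP does 
not have built in, by writing a macro.  The names "my-while" 
and "count-down" are hypothetical, and this is just one 
simple way to do it, not the only one:

```lisp
;; Sketch: extending the language with a new control construct.
;; MY-WHILE and COUNT-DOWN are hypothetical names for this illustration.
(defmacro my-while (test &body body)
  ;; Expands into a plain LOOP that exits as soon as TEST is false.
  `(loop
     (unless ,test (return nil))
     ,@body))

(defun count-down (n)
  ;; Collect N, N-1, ..., 1 using the new construct.
  (let ((result '()))
    (my-while (> n 0)
      (push n result)
      (setf n (- n 1)))
    (reverse result)))
```

Once "my-while" is defined, it can be used like any other 
LISP form:  (count-down 3) returns (3 2 1).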

Next week we'll see an example of LISP's extensibility when
we take a look at a small Scheme interpreter written in LISP.



Copyright 1998 by Kurt Eiselt.  All rights reserved.

Last revised: November 24, 1998