CS 1321X - Lecture 24 - November 11, 2003

Object-Oriented Programming the Hard Way


I. The lexical closure 

Let's take another look at the notion of process and state. Let's say that 
I want to construct a function that simulates a Coke (TM) machine. To do 
this, I'll need a state variable to tell me how many cans of Coke there 
are in this machine. I could do this: 

(define number-of-cans 5)

and then my function could be this: 

(define (get-coke)
  (cond ((> number-of-cans 0)
         (set! number-of-cans (- number-of-cans 1))
         (write "have a Coke"))
        (else (write "sorry, out of Coke"))))

I can now call "get-coke" five times, and I'll see the same message ("have 
a Coke"), but on the sixth call, I'll see a new message ("sorry, out of 
Coke"). I now have a function which knows its history -- it has state. 

However, "number-of-cans" is a free variable -- that is, it's initialized 
outside the lexical scope of the function in which it's referenced, and I 
don't like that. In this case, it's a global variable that's modifiable by 
any function that doesn't shadow the name with its own local or 
lexically-scoped variable, and that's a situation that's ripe for trouble. 
I'd really like this variable to be local to just one specific Coke machine.  
And I'm sure I don't want to have to create a separate global variable for 
each simulated Coke machine, especially if I'm going to simulate the behavior 
of some vending machine company that may own hundreds or thousands of 
individual Coke machines. Keeping track of all that could be difficult, 
to say the least. 
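
To make the danger concrete, here's a small sketch of how a global state 
variable can get clobbered.  The function "restock-snacks" is a name I made 
up for illustration, and this "get-coke" returns its message as a string 
instead of writing it, just so the results are easy to inspect:

```scheme
(define number-of-cans 5)

(define (get-coke)
  (cond ((> number-of-cans 0)
         (set! number-of-cans (- number-of-cans 1))
         "have a Coke")
        (else "sorry, out of Coke")))

;; Hypothetical unrelated function that happens to reuse the same global name.
(define (restock-snacks)
  (set! number-of-cans 0))

(get-coke)         ; "have a Coke" (four cans left)
(restock-snacks)   ; oops: the Coke count is now 0
(get-coke)         ; "sorry, out of Coke", even though only one was sold
```

Nothing in "get-coke" changed, yet its behavior did, because somebody else 
had access to its state.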

As you know, to create additional local variables beyond those that are 
given in a function's argument list, I can use Scheme's "let" form. So my 
revised "get-coke" might look like this: 

(define (get-coke)
  (let ((number-of-cans 5))
    (cond ((> number-of-cans 0)
           (set! number-of-cans (- number-of-cans 1))
           (write "have a Coke"))
          (else (write "sorry, out of Coke")))))

Now, "number-of-cans" is lexically-scoped within the confines of the "let" 
form, so no other function can clobber it. But it doesn't work the way I 
want it to, because each time I call "get-coke", my local state variable 
"number-of-cans" is reset to the value 5. Waaahhh!!! 

Rethinking this, I decided that what I'd really really like to do is 
associate a variable with my Coke machine simulation such that: 

1.  It's local to the function, so nobody else can mess with it. 
    
2.  The state is saved in this variable, even after the function has been 
    exited, and the next time I call this function, the state variable 
    contains exactly what was left in it when this function was last exited. 

In other words, I want the data that's specific to the function encapsulated 
with the function, so that only that function can alter it.  How can I do 
this? We need to take advantage of lexical scoping: we can combine the 
lexically-scoped variable of the "let" form with something we've only 
talked about briefly:  the "lambda" expression.


II.  The lambda expression

Remember the lambda expression?  Bryan talked about it a few lectures ago.
But as a reminder, a lambda expression is nothing more than a function body 
without a name.  You've been using them all along without knowing about 
them, because the syntactic dialect of Scheme that we've been using hides 
that fact.  But when you type a function definition like this:

(define (square x)
  (* x x))

that's being translated into this form:

(define square
  (lambda (x)
    (* x x)))

How is that translation done?  With either special forms or macros (which 
are like special forms that you can write), but that's stuff you don't
need to worry about this semester.

So what does that lambda in the latter form mean?  It says "here's a function 
body that takes one parameter, x, and multiplies it by itself and returns 
that value.  Compile this function body -- that is, make it into something 
executable -- and bind that executable function body to the name 'square' so 
that when somebody invokes the function name square on an appropriate 
argument, this executable chunk of code will be executed."  Whew!

In short, the "lambda" tells Scheme to compile the function body that follows,
and return the executable result.  That is, a lambda expression can be viewed 
as an expression whose value is a function.
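
Here's a short sketch of what that means in practice.  The only assumption 
is a standard Scheme with "map" available; the three uses below are the same 
nameless squaring function used three different ways:

```scheme
;; Apply a nameless function directly to an argument.
((lambda (x) (* x x)) 4)               ; evaluates to 16

;; Bind the same lambda to a name, which is what the define shorthand does.
(define square
  (lambda (x)
    (* x x)))
(square 4)                             ; evaluates to 16

;; Hand a nameless function to another function.
(map (lambda (x) (* x x)) '(1 2 3))    ; evaluates to (1 4 9)
```

In each case the value of the lambda expression is a function; giving it a 
name is optional.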

The name "lambda" itself comes from something called "lambda calculus", the 
mathematical precursor to functional programming that was developed by 
Alonzo Church.  You can read more about Church in a slightly-dense but still 
informative history of the mathematical march toward computers entitled 
"The Advent of the Algorithm" by David Berlinski.


III.  Back to the Coke machine simulation

Now that we know this stuff, what does it buy us?  How can we use this 
knowledge to solve our problem?

Well, the answer isn't immediately obvious, so don't be discouraged if it 
didn't just leap out at you.  In fact, it's sort of obscure, yet it's a 
really cool idea, which is this:  When lambda returns a compiled function 
body, it needs to bind real values to any free variables -- variables that 
aren't named in the parameter list or defined in a "let" inside the "lambda"
-- that are referenced inside that function body.  So if we reference some 
variable named foo, for example, inside that function body, and foo isn't in 
the parameter list, then Scheme needs to find a binding for foo in the 
surrounding environment (one created by some let or define outside of the 
lambda expression), or it barfs as soon as that reference to foo is 
evaluated.  And better yet, when it does bind that variable foo to some value, 
it internalizes foo in such a way that no other function can get access to 
it.  That is, this particular instance of the variable foo is accessible 
only to the function that includes the reference to foo!  That's exactly 
what we want.  So let's take that last approximation to "get-coke":

(define (get-coke)
  (let ((number-of-cans 5))
    (cond ((> number-of-cans 0)
           (set! number-of-cans (- number-of-cans 1))
           (write "have a Coke"))
          (else (write "sorry, out of Coke")))))

and take advantage of lambda to encapsulate the variable "number-of-cans" 
within the compiled function:

(define (get-coke)
  (let ((number-of-cans 5))
    (lambda ()
      (cond ((> number-of-cans 0)
             (set! number-of-cans (- number-of-cans 1))
             (write "have a Coke"))
            (else (write "sorry, out of Coke"))))))

Then we evaluate it, and things start to behave a little differently than 
before.  If I invoke (get-coke), I won't see:

> (get-coke)
"have a Coke"

Instead, I'll see this:

> (get-coke)
#<procedure>

Why does that happen?  Because we've redefined the "get-coke" function so 
that it compiles its function body and returns that compiled function instead 
of applying that function body to an argument list and returning that result.
We've turned "get-coke" into a function that returns a function.

So now we have a function body that has no name.  How do we use it?  There 
are a couple of approaches worth looking at here.  The first approach is to 
use a tool that Scheme provides to us for using compiled but nameless 
functions.  It's called "apply", and it's the mechanism that Scheme uses to 
apply a compiled lambda function to its arguments.  In the case of "get-coke",
there are no arguments, so we could apply the result of invoking "get-coke" 
to the empty list as follows:

> (apply (get-coke) '())
"have a Coke" 

And just out of curiosity, what happens if we don't bind that "number-of-cans"
variable to some value?

(define (get-coke)
;  (let ((number-of-cans 5))
    (lambda ()
      (cond ((> number-of-cans 0)
             (set! number-of-cans (- number-of-cans 1))
             (write "have a Coke"))
            (else (write "sorry, out of Coke")))))  ;)

> (get-coke)
#<procedure>
> (apply (get-coke) '())
reference to undefined identifier: number-of-cans
>

But I digress.  We still have a problem with the working version -- every
time we call "get-coke" here, we get a brand-new copy of the function that 
"get-coke" returns, with its own fresh supply of five Cokes, so we never run 
out of Cokes.  We'll see another use for apply later...it's just not what we 
need right now.
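
A quick sketch shows the problem.  The helper "buy-n-times" is a name I made 
up for this illustration, and this "get-coke" returns its message as a string 
rather than writing it, so that the answers can be collected in a list:

```scheme
(define (get-coke)
  (let ((number-of-cans 5))
    (lambda ()
      (cond ((> number-of-cans 0)
             (set! number-of-cans (- number-of-cans 1))
             "have a Coke")
            (else "sorry, out of Coke")))))

;; Hypothetical helper: ask for a Coke n times, each time from whatever
;; function (get-coke) hands back, and collect the answers in a list.
(define (buy-n-times n)
  (if (= n 0)
      '()
      (cons ((get-coke)) (buy-n-times (- n 1)))))

(buy-n-times 7)   ; seven "have a Coke" answers, never "sorry, out of Coke"
```

Note that ((get-coke)) calls the returned function directly; with no 
arguments to pass, it does the same job as (apply (get-coke) '()).  Either 
way, every request goes to a freshly built machine with five cans in it.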

In the meantime, a better way would be to give the result of "get-coke" its 
own function name.  Let's call it "cs-machine":

> (define cs-machine (get-coke))
> (cs-machine)
"have a Coke"

This way, I get only one copy of the function that "get-coke" returns, and if 
I keep on invoking "cs-machine", I'll eventually run out of Cokes:

> (cs-machine)
"have a Coke"
> (cs-machine)
"have a Coke"
> (cs-machine)
"have a Coke"
> (cs-machine)
"have a Coke"
> (cs-machine)
"sorry, out of Coke"
> 

So the "get-coke" function no longer simulates my getting a Coke out of a 
Coke machine.  Instead, "get-coke" now creates a simple model of a Coke 
machine which in turn I can bind to a unique name.  And then when I invoke 
the name of that Coke machine, it behaves like a Coke machine -- it gets me 
a Coke.  So we really ought to change the name of the function from "get-coke"
to "make-coke-machine":

(define (make-coke-machine)
   (let ((number-of-cans 5))
     (lambda ()
       (cond ((> number-of-cans 0)
              (set! number-of-cans (- number-of-cans 1))
              (write "have a Coke"))
             (else (write "sorry, out of Coke"))))))

Once again, after evaluating "make-coke-machine", I can do this: 

> (define cs-machine (make-coke-machine))
> cs-machine
#<procedure>
>

and I'll have a simulated Coke machine with its own name, "cs-machine", and 
its own internal state variable. 

I can simulate the purchase of one icy can of Atlanta's own nectar of the 
gods just by calling the function now named "cs-machine". 

> (cs-machine)
"have a Coke"

How many cans are left? Well, there had better be four. But note that I can't 
really tell, because I can't access the variable that's encapsulated within 
the "cs-machine" procedure: 

> number-of-cans
reference to undefined identifier: number-of-cans
>

I'm prevented from accessing that variable; think of that variable as 
something that can be accessed from within that procedure, but not from 
outside...it's private. 

But now let's say I want to simulate a whole college campus full of Coke 
machines. And I want each of those machines to act independently of each 
other...buying a Coke from one shouldn't change the number of Cokes in 
another, as might happen with some of the approaches described above. Well, 
that's already done. Let me reset my computer science Coke machine: 

> (define cs-machine (make-coke-machine))
>

And now I'll create another machine that we'll put in electrical engineering: 

> (define ee-machine (make-coke-machine))
>

Then a lot of thirsty EE majors come by and buy all the Cokes out of 
ee-machine: 

> (ee-machine)
"have a Coke"
> (ee-machine)
"have a Coke"
> (ee-machine)
"have a Coke"
> (ee-machine)
"have a Coke"
> (ee-machine)
"have a Coke"
> (ee-machine)
"sorry, out of Coke"
>

If ee-machine and cs-machine are truly independent, thirsty CS majors should 
still have a full load of five Cokes in their machine, even though the 
hoggish EEs have emptied their machine: 

> (cs-machine)
"have a Coke"
> (cs-machine)
"have a Coke"
> (cs-machine)
"have a Coke"
> (cs-machine)
"have a Coke"
> (cs-machine)
"have a Coke"
> (cs-machine)
"sorry, out of Coke"
>

As a bit of a convenience (this isn't a big conceptual detail, just a frill), 
I can write a more generic Coke-buying function that allows me to just pass 
the name of the Coke-machine I want to buy from: 

(define (get-coke machine)
  (apply machine '()))

And that's what the "get-coke" function looks like now!

Then I reset a machine: 

> (define cs-machine (make-coke-machine))

And I buy a Coke like this:

> (get-coke cs-machine)
"have a Coke"
>

This is pretty cool stuff, but my simulation is a little bit awkward. For 
example, every time I want to reload my Coke machine, I have to redefine my 
Coke machine. It would be nice if I could just send a message to my Coke 
machine and tell it that I either want to buy a Coke or reload it with a 
fresh supply. Here's how I can do that (and note that there might be other 
ways to accomplish the same thing, and they might even be more elegant, but 
this will suffice for now): 

(define (make-coke-machine-v2)
  (let ((number-of-cans 0))
    (lambda (dowhat)
      (cond ((equal? dowhat 'load)
             (set! number-of-cans 20)
             (display number-of-cans)
             (display " frosty cans waiting for you") (newline))
            ((and (equal? dowhat 'buy) (> number-of-cans 0))
             (set! number-of-cans (- number-of-cans 1))
             (display "have a Coke") (newline)
             (display "cans remaining: ")
             (display number-of-cans)
             (newline))
            ((equal? dowhat 'buy)
             (display "sorry, out of Coke") (newline))
            (else (write "please use correct change"))))))

Now I can create as many instances of Coke machines as I want, and I can 
send them each messages telling them how I want them to behave: 

> (define cs-machine (make-coke-machine-v2))
> (cs-machine 'buy)
sorry, out of Coke
> (cs-machine 'load)
20 frosty cans waiting for you
> (cs-machine 'buy)
have a Coke
cans remaining: 19
> (cs-machine 'buy)
have a Coke
cans remaining: 18
> (cs-machine 'buy)
have a Coke
cans remaining: 17
> (cs-machine 'foo)
"please use correct change"
>

These functions that I have created with their own internal local state 
variables are called "lexical closures". Sometimes these things are called 
"generators". They can have multiple local variables, and you can also pass 
values to them via parameters at the time they are created, and those values 
will be internalized too. 
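
As a sketch of those last two points, here's a variant with a creation-time 
parameter and a second private variable.  The names "make-stocked-machine", 
"cans-sold", and the 'sold message are all made up for this illustration, and 
the messages are returned as strings rather than displayed:

```scheme
(define (make-stocked-machine initial-cans)
  (let ((number-of-cans initial-cans)   ; starts from the value passed in
        (cans-sold 0))                  ; a second private state variable
    (lambda (dowhat)
      (cond ((and (equal? dowhat 'buy) (> number-of-cans 0))
             (set! number-of-cans (- number-of-cans 1))
             (set! cans-sold (+ cans-sold 1))
             "have a Coke")
            ((equal? dowhat 'buy) "sorry, out of Coke")
            ((equal? dowhat 'sold) cans-sold)
            (else "please use correct change")))))

(define big-machine (make-stocked-machine 100))
(define tiny-machine (make-stocked-machine 1))

(tiny-machine 'buy)    ; "have a Coke"
(tiny-machine 'buy)    ; "sorry, out of Coke"
(tiny-machine 'sold)   ; 1
(big-machine 'sold)    ; 0, untouched by what happened to tiny-machine
```

Each machine keeps its own copies of both variables, and the starting count 
is whatever was handed to "make-stocked-machine" when that machine was made.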

A lexical closure is a great tool for simulating independent real-world 
objects where supply and demand is an important issue. Examples are robots, 
operating systems, grocery store checkout lines, automated teller machines, 
pumps at the gas station, and yes, vending machines. 

This lexical closure idea is a teeny weeny bit obscure and unwieldy, yet it 
allows me to write programs that model the world as a bunch of independent 
but interacting objects...and that's the kind of thing that programmers want 
to do all the time: build models where the world consists of classes of 
things (coke-machines, for example) and instances of classes (like cs-machine 
vs. ee-machine...we could call them objects) and messages passed between 
objects that tell them what to do (buy and load). 

Hey, since building programs like that is pretty desirable, wouldn't it be 
nice if there were another way of thinking about programming that made this 
sort of stuff easier? And wouldn't it be even nicer if there were programming 
languages that made writing these kinds of programs easier?  

Patience, grasshoppers...patience.



Copyright (c) 2003 by Kurt Eiselt.  All rights reserved, with 
the exception of stuff that belongs to somebody else.

Last revised: November 11, 2003