CS 1321X - Lecture 23 - November 6, 2003

Enough Stuff About Vectors To Make Your Head Explode


I.  I lied about the scoping

Last time we talked about iterating across vectors.  Let's try a 
different example that involves mutating or changing the values 
in a vector. Let's say I have a vector filled with numbers, 
and I want to square all the numbers and return that vector. I'll 
define the vector like this:

>  (define x (vector 1 2 3 4 5))
>  x
#5(1 2 3 4 5)
>

And then use this nifty procedure:

(define (square-vector sqrvector)
    (do ((index 0)
         (value 0)
         (max (- (vector-length sqrvector) 1)))  
      ((> index max) sqrvector)
      (set! value (vector-ref sqrvector index))
      (vector-set! sqrvector index (* value value))
      (set! index (+ index 1))))


So what happens when I run it?

>  (square-vector x)
#5(1 4 9 16 25)
>

Well, that's exactly what I want. The procedure returns a 
five-element vector with the original elements squared. Of course, 
it's just a copy, right? The original vector that I passed to 
square-vector is still intact, isn't it?

>  x
#5(1 4 9 16 25)
>


Uh oh. Now there's trouble. It's not supposed to work that way. What 
happened to local scoping and all that?  The changes I make to some
locally or lexically scoped variable aren't supposed to extend beyond
the lifetime of the function where that variable was created.  

Maybe I could fix it by making a copy of the vector before I go
messing with it....

(define (square-vector sqrvector)
    (do ((index 0)
         (value 0)
         (numvector sqrvector)
         (max (- (vector-length sqrvector) 1)))    ;; why can't I use
                                                   ;; numvector here??
                                                   ;; because, as with let,
                                                   ;; these init expressions
                                                   ;; can't see the other
                                                   ;; do variables, remember?
      ((> index max) numvector)
      (set! value (vector-ref numvector index))
      (vector-set! numvector index (* value value))
      (set! index (+ index 1))))

> x
#5(1 2 3 4 5)
> (square-vector x)
#5(1 4 9 16 25)
> x
#5(1 4 9 16 25)
>

That doesn't work either.  What happened?


II. Parameter passing 

In short, we discovered that if we pass a vector as an argument to a
function through the parameter list, and if that function changes a
value in that vector, the change is NOT local to the function making
the change...instead, contrary to what we've been seeing until now,
that change will be visible to the calling function that passed the
vector. 

That behavior of course violates everything we've seen all semester,
not to mention what we just recently learned about lexical scoping.  We've
been encouraged to think that Scheme makes a copy of everything that's
been passed to a function via the parameter list, so changes to values
of parameters should have local scope. So Scheme looks like a language
whose parameter passing is what's called "pass-by-value" (or sometimes
"call-by-value"), in which a calling function passes a copy of the
value of an argument to the called function. 

But if that vector above was a copy, and in fact, we even made a copy
of that copy in this particular case, why did a change to a cell in the
vector have more global effects than expected? The answer, my friend, is
blowing in the wind...no, that's not it...the answer is that Scheme
acts like it's a "pass-by-value" language as long as we stay within the
purely functional programming paradigm.  It's not implemented as a
"pass-by-value" language... more about that in a moment...but as long
as it looks like a "pass-by-value" language, then we can call it a
"pass-by-value" language. And it always looks like a "pass-by-value"
language when we don't use any side effects. (See, I told you that side
effects just make things complicated.) Parameters that are
"pass-by-value" are often called "in parameters", as we've said before,
indicating that information goes into the called function from the
calling function, but information doesn't go out of the called function
back to the calling function through those same parameters. 

So what really happens when parameters are passed in Scheme? Well, it's
true that copies of values are always being passed, so Scheme is
technically still a "pass-by-value" language. What makes things weird
here is that what's being passed from calling function to called
function is not the data you think is being passed, but pointers to (or
addresses of) that data. That is, what's being passed through the
parameters is a copy of the value of the address of the data; it's
what's called a "reference". So in that sense, Scheme is a
"pass-by-reference" language. Let's say a calling function passes a
pointer to a list, for example, to a called function, and the called
function conses something to that list, like this: 


(define (foo)
   (let ((y '(a b c)))
     (bar y)
     (print y)))

(define (bar x)
   (set! x (cons 'd x))
   (print x)
   (newline))

Welcome to DrScheme, version 100alpha4.
Language: MzScheme.
> (foo)
(d a b c)
(a b c)
>

What happens to x? When Scheme conses 'd onto '(a b c), the pointer
bound to x is changed to point to a new cons cell containing 'd, and
that cell now points to the list '(a b c) as being the cdr of the list. 
So the copy of the address that was in x got changed to some new value, 
and that effect is local to the function bar. The function foo will never 
know the difference. The data structure '(a b c) that was at the end of that
original pointer is unchanged. 

Similarly, if foo passed the number 6 to bar, and bar changed that
parameter to 7, the effect would be local to bar, since changing the
value to 7 is really just changing a pointer that points to a memory
cell containing the value 6 to a pointer that points to a memory cell
containing the value 7. 
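
Here's a minimal sketch of that number case. The names change-to-seven
and caller are made up for illustration; they aren't from anything we
did in class.

```scheme
;; Hypothetical helpers: change-to-seven and caller are not from the lecture.
(define (change-to-seven n)
  (set! n 7)     ; rebinds the local parameter n only
  n)

(define (caller)
  (let ((y 6))
    (let ((result (change-to-seven y)))
      (list result y))))   ; y is still 6 here

(caller)   ; => (7 6)
```

The 7 comes back as the return value of change-to-seven, but the y that
belongs to caller never moves off of 6.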

However, if foo passed a pointer to a vector to bar, and bar used
vector-set! to change the value of a cell within that vector, Scheme
then follows the pointer to where that vector starts in memory, finds
the cell indicated by the index, and puts something different in that
cell. Sure, the pointer to that vector is a copy of the original
pointer, but it's no longer the pointer that's being manipulated...
it's the data structure at the end of the pointer. So when the vector
at the end of that pointer is changed, it's the same vector that foo
had a pointer to, and the change made by bar is felt by foo: 

(define (foo)
   (let ((y (vector 0 0 0)))
     (bar y)
     (print y)))

(define (bar x)
   (vector-set! x 1 'foo)
   (print x)
   (newline))

Welcome to DrScheme, version 102.
Language: Textual Full Scheme (MzScheme).
> (foo)
#3(0 foo 0)
#3(0 foo 0)
>

As we've said at some point previously, when a parameter allows
information to be passed from the called function to the calling
function, it's called an "out parameter". And when a parameter allows
information to pass in both directions, it's called an "in/out
parameter". Clearly, the "pass-by-reference" thing that goes on in
Scheme is making in/out parameters, not just in parameters. 

So long as the operations in the called function manipulate the
pointers (or addresses or references) but don't alter the data
structures at the ends of those pointers, it's going to look like 
"pass-by-value" and the changes have local scope. But if the operations 
in the called function maintain the pointers to the data structures 
while changing the data structures at the ends of the pointers, it's 
going to look like "pass-by-reference" and the effects of the changes 
will not be local to the called function. 
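
Given all that, one way to get the "copy" behavior we were hoping for
back in square-vector is to build a brand-new vector and leave the
caller's vector alone. This is only a sketch; the name
square-vector-copy is an assumption, not something from the lecture.

```scheme
;; Hypothetical variant (not from the lecture): build a brand-new vector
;; so the caller's vector is never touched.
(define (square-vector-copy sqrvector)
  (do ((result (make-vector (vector-length sqrvector) 0))
       (index 0 (+ index 1)))
      ((= index (vector-length sqrvector)) result)
    (vector-set! result
                 index
                 (* (vector-ref sqrvector index)
                    (vector-ref sqrvector index)))))

(define x (vector 1 2 3 4 5))
(square-vector-copy x)   ; => a fresh vector holding 1 4 9 16 25
x                        ; => still holds 1 2 3 4 5
```

The only vector-set! calls here are on result, which was created inside
the procedure, so nothing the caller can see gets changed.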

And don't think that list structures are somehow immune. Again, it's
all about the operations on the data structures. Check out set-car! and
set-cdr! to see how to mess with a list in function bar above and have
the effects seen by function foo. Here's an example of set-car! in action:

(define (foo)
   (let ((y (list 'a 'b 'c)))   ;; a fresh list...some Schemes won't let
                                ;; you mutate a quoted constant
     (bar y)
     (print y)))

(define (bar x)
   (set-car! x 'z)
   (print x)
   (newline))


Welcome to DrScheme, version 102.
Language: Textual Full Scheme (MzScheme).

> (foo)
(z b c)
(z b c)
>
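
And here's a comparable sketch for set-cdr!, assuming a Scheme that
still provides it for ordinary pairs (R5RS-style Schemes do). The name
chop-tail! is made up for this example.

```scheme
;; Hypothetical example (chop-tail! is not from the lecture). The list is
;; built with list rather than a quoted literal, since some Schemes treat
;; quoted constants as immutable.
(define (chop-tail! x)
  (set-cdr! x '()))      ; the first pair's cdr now points to the empty list

(define y (list 'a 'b 'c))
(chop-tail! y)
y   ; => (a)
```

Just as with vector-set!, the pointer in x is only a copy, but the pair
at the end of that pointer is the very same pair that y points to.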

Don't you wish you'd known about set-car! and set-cdr! before now? 
We don't.


III. Stacks as vectors

You all know what a stack is already...we talked about them before. 
But for those of you who were snoozing, a stack is a linear data 
structure in which objects are added to ("pushed") or removed from 
("popped") the same end. That end is called the "top" of the stack. 
Thus as things are pushed onto the top of the stack, the stack 
grows in the direction toward which things are added. And when things 
are removed from that end, the stack shrinks back toward the bottom 
end. Consequently, one end, called the "bottom", is always fixed. The 
stack is a commonly-used data structure, and in fact, that activation 
stack we've talked about frequently is one common use of the stack 
data structure. Items are popped off the stack in the reverse
order that they are pushed on the stack, so a stack is a way of 
reversing the order of things: the things that were pushed on most 
recently are going to be popped first and presumably dealt with 
sooner than things that were pushed earlier. So a stack is a 
last-in-first-out or LIFO structure.

We can implement a stack in Scheme as a vector (though it's not the 
only way...you can make a Scheme linked list into a stack by making 
sure you only take things off the same end you cons things onto). 
Let's make a really small stack:

(define stacksize 2)

(define mystack (make-vector stacksize))

And we'll initialize a global variable called mytop that points to 
the vector cell that contains the topmost element. We'll initialize 
that variable to -1 to indicate that the stack is empty. If the 
stack had something in the first cell, whose index is 0, then mytop 
would contain the value 0.

(define mytop -1)

Here's what the "push" operation looks like:

(define (push item stack)
    (if (>= mytop (- stacksize 1))
        #f
        (begin (set! mytop (+ 1 mytop))
               (vector-set! stack mytop item))))

And here's what happens when we put the stack through its paces:

Welcome to DrScheme, version 100alpha4.
Language: MzScheme.

>  mystack
#2(0)
>  mytop
-1
>  (push 'foo mystack)
>  mystack
#2(foo 0)
>  mytop
0
>  (push 'bar mystack)
>  mystack
#2(foo bar)
>  mytop
1
>

We've pushed two things on our two-cell stack, and sure enough, 
DrScheme shows us two things in the vector and mytop points to the 
second cell in that vector. Now we try to push one more thing, but 
"push" won't let it happen and returns #f:

>  (push 'baz mystack)
#f
>  mystack
#2(foo bar)
>  mytop
1
>

To put more things on the stack, we'll have to take something off the 
stack. We do that with the "pop" operation. We'll write pop so that 
it returns the thing that was popped off the top of the stack:

(define (pop stack)
    (let ((item '()))
      (if (< mytop 0)
          #f
          (begin (set! item (vector-ref stack mytop))
                 (set! mytop (- mytop 1))
                 item))))

The "pop" function could also be written like this:

(define (pop stack)
  (if (< mytop 0)
      #f
      (begin (set! mytop (- mytop 1))
             (vector-ref stack (+ mytop 1)))))

Let's push a few things on a five-cell stack this time (after redefining 
stacksize as 5 and resetting mytop to -1):

Welcome to DrScheme, version 100alpha4.
Language: MzScheme.

>  (define mystack (make-vector 5))
>  mystack
#5(0)
>  (push 'a mystack)
>  (push 'b mystack)
>  (push 'c mystack)
>  mystack
#5(a b c 0)
>  mytop
2

And now we pop something off the stack...I hope the "pop" procedure 
returns the symbol c....

>  (pop mystack)
c
>  mystack
#5(a b c 0)
>  mytop
1


It sure does. Whew!

>  (pop mystack)
b
>  mytop
0
>  mystack
#5(a b c 0)
>  (push 'foo mystack)
>  mystack
#5(a foo c 0)
>  mytop
1
>

In programming land, it's not uncommon to see the interface to a 
push be exactly what we've defined: the push command, the item to 
push, and the stack to push it on. Similarly for pop: we just say 
"pop" and give the stack to be popped. But I've got this global 
variable for the top of the stack sitting out there that anyone can 
mess with, and how do I know I'm using the right top of stack pointer 
if I've got lots of stacks to keep track of?

I could take over more of the management myself, and just pass the 
top of stack pointer that I want:

(define (push item stack mytop)
    (if (>= mytop (- stacksize 1))
        #f
        (begin (set! mytop (+ 1 mytop))
               (vector-set! stack mytop item))))

The only problem is that this new approach just doesn't work, because 
mytop is now local to "push":


Welcome to DrScheme, version 100alpha4.
Language: MzScheme.
>  mystack
#5(0)
>  mytop
-1
>  (push 'foo mystack mytop)
>  mystack
#5(foo 0)
>  mytop
-1
>  (push 'bar mystack mytop)
>  mystack
#5(bar 0)
>  mytop
-1
>


In other words, every time I push, I just write over the previous 
thing I pushed. How come? That's just life in imperative programming 
land. Whether or not the results of your mutations on data objects 
get passed back through the parameter list and have effects outside 
of the called procedure depends on the type of the object that's been 
passed through the parameter list and the type of operation that's 
been done on the object. Change the value of some integer type that's 
passed as a parameter, and all you're doing is manipulating a copy of 
the value (pass-by-value) and there are no side effects outside the 
called procedure. Change the value of some cell in a vector type 
that's passed as a parameter, and you're working on the actual 
vector, not a copy (pass-by-reference). And if you think it's just 
some weird Scheme thing, note that C has pass-by-value parameter 
passing...except when you're passing arrays, where what gets passed is 
a pointer to the first element, so C acts like a pass-by-reference 
language. C is nice enough to let you know in 
advance that that's what happens...I'm told that Pascal hides the 
fact from you and lets you figure it out for yourself. Nice language. 
And for you Java fans, note that Java's model of parameter passing is 
pretty much the same as Scheme's model:

  "The difference between variables of primitive types and objects 
  (reference types) has implications for parameter passing to methods, too.  
  Variables of primitive types are passed by value; objects are passed by 
  reference. 'Passing by value' means that the argument's value is copied, 
  and is passed to the method. Inside the method this copy can be modified 
  at will, and doesn't affect the original argument. 'Passing by reference' 
  means that a reference to (i.e., the address of) the argument is passed 
  to the method. Using the reference, the method is actually directly 
  accessing the argument, not a copy of it. Any changes the method makes to 
  the parameter are made to the actual object used as the argument. After 
  you return from the method, that object will retain any new values set 
  in the method. What's really going on here is that a copy of the value 
  that references an object argument is passed to the method. This is why 
  some Java books say 'everything is passed by value'---the object reference 
  is passed by value which effectively passes the object itself by reference." 

  - from "Just Java 2" by Peter van der Linden (p. 48)

Sorry, there's just no getting away from it, is there?  Like I said, 
when you add side-effects, life just gets harder and harder.

Let's get back to my implementation of the stack data type.
What I'd really like to do is to keep all that information about 
stacksize and the top of stack pointer with the data structure 
itself, so that if I have many stacks I'm dealing with, they each 
contain their own important data. As long as I'm consistent in what I 
store and where I store it, I can use the same procedures to work on 
lots of different stacks. In wrapping up a data structure and its 
operations into one nice package, I'm creating an abstract data type 
called a stack with its associated procedures push and pop. And in so 
doing, I'm doing something called "encapsulation", which we'll see 
again and again in the remainder of the class.


(define stacksize 10)

(define mystack (make-vector stacksize))

(vector-set! mystack 0 stacksize)   ;; let's keep the stacksize with the stack
                                    ;; in vector cell 0
(vector-set! mystack 1 1)           ;; we'll keep the pointer to the top of the
                                    ;; stack in cell 1, and since the first
                                    ;; empty cell in the stack would be at
                                    ;; index 2, we'll initialize the top of
                                    ;; stack to 1 to indicate the stack is
                                    ;; empty

(define (push item stack)
    (if (>= (vector-ref stack 1) (- (vector-ref stack 0) 1))
        #f
        (begin (vector-set! stack 1 (+ 1 (vector-ref stack 1)))
               (vector-set! stack (vector-ref stack 1) item))))


(define (pop stack)
    (let ((item '()))
      (if (< (vector-ref stack 1) 2)
          #f
          (begin (set! item (vector-ref stack (vector-ref stack 1)))
                 (vector-set! stack 1 (- (vector-ref stack 1) 1))
                 item))))


Welcome to DrScheme, version 100alpha4.
Language: MzScheme.

>  mystack
#10(10 1 0)
>  (push 'a mystack)
>  (push 'b mystack)
>  (push 'c mystack)
>  mystack
#10(10 4 a b c 0)


Pretty slick, eh? We pushed three things on the stack, DrScheme 
displays three things on the stack past the first two cells that 
contain the stacksize (10) and the index (4) of the cell that 
contains the most recently pushed entity. Now we pop stuff, and the 
index to the top of stack changes, but the contents of the cells that 
follow don't change at all. Do we care? No...as long as the pop 
reduced the top of stack pointer by one, everything will be fine. 
Pushing something new will write over the previous value stored in 
that cell:

>  (pop mystack)
c
>  mystack
#10(10 3 a b c 0)
>  (pop mystack)
b
>  mystack
#10(10 2 a b c 0)
>  (pop mystack)
a
>  mystack
#10(10 1 a b c 0)
>  (pop mystack)
#f
>  mystack
#10(10 1 a b c 0)
>  (push 'foo mystack)
>  mystack
#10(10 2 foo b c 0)
>


While we've shown off some nice data abstraction here, we've not 
done a very good job of procedural abstraction. We could make all 
those references to the top of stack pointer a little bit less 
obscure:

(define (push item stack)
    (if (>= (get-stack-top stack) (- (get-stack-size stack) 1))
        #f
        (begin (vector-set! stack 1 (+ 1 (get-stack-top stack)))
               (vector-set! stack (get-stack-top stack) item))))

(define (pop stack)
    (let ((item '()))
      (if (< (get-stack-top stack) 2)
          #f
          (begin (set! item (vector-ref stack (get-stack-top stack)))
                 (vector-set! stack 1 (- (get-stack-top stack) 1))
                 item))))

(define (get-stack-top stack)
    (vector-ref stack 1))

(define (get-stack-size stack)
    (vector-ref stack 0))


Of course, we could rewrite that last function as a call to 
vector-length, and save ourselves one cell in the vector since we 
wouldn't have to store the stack size, but I'm just trying to make 
things as obvious as possible.  Also, there are times when having the 
information about the stack size can be handy, such as when you've 
saved your stack on your disk drive, but there's been some massive 
disk failure and now in order to recover your data from the disk you 
need all the information you can get by reading what's left of the 
disk.  Recreating your stack will be a lot easier if the information 
about stack size is explicitly saved in the stack...trust me.

I'm sure you could optimize this stack stuff even more. And speaking 
of things for you to do, I'm still not happy with the abstraction 
here (although it's better than it was), so I'll leave it to you to 
make it even better.
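
Here's one possible direction, as a sketch only. The names make-stack,
set-stack-top!, stack-empty? and stack-full? are my own assumptions,
and so is the choice to have push return #t on success. The vector
layout is the same one used above (size in cell 0, top index in cell
1, items from cell 2 up), and the size is assumed to be at least 2.

```scheme
;; Hypothetical helpers: make-stack, set-stack-top!, stack-empty? and
;; stack-full? are assumptions, not from the lecture. Layout matches the
;; notes: cell 0 = stack size, cell 1 = index of the top item, items in
;; cells 2 and up.
(define (make-stack size)
  (let ((stack (make-vector size 0)))
    (vector-set! stack 0 size)
    (vector-set! stack 1 1)        ; a top of 1 means "empty"
    stack))

(define (get-stack-size stack) (vector-ref stack 0))
(define (get-stack-top stack) (vector-ref stack 1))
(define (set-stack-top! stack new-top) (vector-set! stack 1 new-top))

(define (stack-empty? stack) (< (get-stack-top stack) 2))
(define (stack-full? stack)
  (>= (get-stack-top stack) (- (get-stack-size stack) 1)))

(define (push item stack)
  (if (stack-full? stack)
      #f
      (begin (set-stack-top! stack (+ (get-stack-top stack) 1))
             (vector-set! stack (get-stack-top stack) item)
             #t)))                 ; #t on success is an assumption

(define (pop stack)
  (if (stack-empty? stack)
      #f
      (let ((item (vector-ref stack (get-stack-top stack))))
        (set-stack-top! stack (- (get-stack-top stack) 1))
        item)))
```

Now push and pop never mention cell numbers at all, and each stack you
make with make-stack carries its own size and top-of-stack index. One
wart remains: pop returns #f both for an empty stack and for a stack
whose top item happens to be #f.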


IV.  Queues as vectors 

From this point until the end of this particular set of notes, we're 
talking about stuff we didn't cover in class, but you're still expected 
to know this stuff.  It's just more things we can do with vectors.

If you remember back to the discussion of breadth-first search, where we 
introduced the queue data structure, you'll recall that a queue is a linear 
data structure similar to a stack, but with one big difference. With a stack, 
data is added or removed from the same end. With a queue, data is added to 
one end (called the "back" of the queue) and removed from the other end 
(called the "front" of the queue). Thus a queue is just like a line that 
you wait in to get tickets at the movies...you join the line at the back 
end, and you slowly creep forward to the front. When you finally reach the 
front of the line, you leave the line, head to a ticket window and buy your 
tickets. Then you join the line for the popcorn... 

A queue preserves the order in which things were added to the queue, so a 
queue is a first-in-first-out or FIFO data structure. The operation to add 
something to the back of the queue is typically called "enqueue", while the 
operation to remove something from the front of the queue is typically called 
"dequeue". Here's some Scheme code to implement a queue as a vector. Note 
also that as things are added to one end of the queue and removed from the 
other, the queue itself crawls through the length of the vector like a worm. 
It's possible to add a little bit of extra math to the procedures below to allow 
the queue to wrap around from the end of the vector (cell 19 in this case) to 
the front of the vector (cell 3), but again I'll leave that as an exercise 
for you, the home viewer. 


(define queuesize 20)
(define myqueue (make-vector queuesize))
(vector-set! myqueue 0 queuesize)  ;; how much room in queue in cell 0
(vector-set! myqueue 1 3) ;; pointer to front of queue in cell 1
(vector-set! myqueue 2 2) ;; pointer to back of queue in cell 2
                          ;; if back pointer is less than front pointer,
                          ;; then the queue is empty

(define (enqueue item queue)
  (if (queue-full? queue)
      #f
      (add-to-back-of-queue item queue)))

(define (dequeue queue)
  (if (queue-empty? queue)
      #f
      (retrieve-from-front-of-queue queue)))

(define (queue-full? queue)
  (>= (get-queue-back queue) (- (get-queue-size queue) 1)))

(define (queue-empty? queue)
  (> (get-queue-front queue) (get-queue-back queue)))

(define (retrieve-from-front-of-queue queue)
  (let ((item '()))
    (begin (set! item (vector-ref queue (get-queue-front queue)))
           (vector-set! queue 1 (+ (get-queue-front queue) 1))
           item)))

(define (add-to-back-of-queue item queue)
  (begin (vector-set! queue 2 (+ 1 (get-queue-back queue)))
         (vector-set! queue (get-queue-back queue) item)))

(define (get-queue-back queue)
  (vector-ref queue 2))

(define (get-queue-front queue)
  (vector-ref queue 1))

(define (get-queue-size queue)
  (vector-ref queue 0))



Welcome to DrScheme, version 100alpha4.
Language: MzScheme.

> myqueue
#20(20 3 2 0)

> (enqueue 'a myqueue)
> (enqueue 'b myqueue)
> (enqueue 'c myqueue)

> myqueue
#20(20 3 5 a b c 0)

> (dequeue myqueue)
a

> myqueue
#20(20 4 5 a b c 0)

> (dequeue myqueue)
b

> myqueue
#20(20 5 5 a b c 0)

> (dequeue myqueue)
c

> myqueue
#20(20 6 5 a b c 0)

> (dequeue myqueue)
#f
#f

> myqueue
#20(20 6 5 a b c 0)

> (enqueue 'foo myqueue)

> myqueue
#20(20 6 6 a b c foo 0)
>
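
Notice that the queue above keeps crawling toward the end of the vector
and never reuses the cells it has left behind. Here's a rough sketch of
the wrap-around idea mentioned earlier. Everything in it is an
assumption of mine rather than the official answer: the ring- names,
and especially the layout, which keeps a count of items in cell 2
instead of a back pointer so that "full" and "empty" are easy to tell
apart. It also assumes the vector has more than three cells.

```scheme
;; Hypothetical wrap-around ("ring") queue. The layout is an assumption
;; that differs from the notes: cell 0 = vector length, cell 1 = index of
;; the front item, cell 2 = number of items stored; data lives in cells 3
;; and up.
(define header-cells 3)

(define (make-ring-queue size)
  (let ((queue (make-vector size 0)))
    (vector-set! queue 0 size)
    (vector-set! queue 1 header-cells)   ; front starts at first data cell
    (vector-set! queue 2 0)              ; no items yet
    queue))

(define (ring-capacity queue) (- (vector-ref queue 0) header-cells))
(define (ring-front queue) (vector-ref queue 1))
(define (ring-count queue) (vector-ref queue 2))

;; Turn "offset cells past the front" into a real vector index, wrapping
;; from the last cell back around to cell 3.
(define (ring-index queue offset)
  (+ header-cells
     (modulo (+ (- (ring-front queue) header-cells) offset)
             (ring-capacity queue))))

(define (ring-enqueue! item queue)
  (if (= (ring-count queue) (ring-capacity queue))
      #f                                 ; queue is full
      (begin
        (vector-set! queue (ring-index queue (ring-count queue)) item)
        (vector-set! queue 2 (+ (ring-count queue) 1))
        #t)))

(define (ring-dequeue! queue)
  (if (= (ring-count queue) 0)
      #f                                 ; queue is empty
      (let ((item (vector-ref queue (ring-front queue))))
        (vector-set! queue 1 (ring-index queue 1))   ; front moves ahead one
        (vector-set! queue 2 (- (ring-count queue) 1))
        item)))
```

The "little bit of extra math" is all in ring-index: the modulo is what
sends an index that runs off the end of the vector back to cell 3.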


V.  Linear search on a vector 

So let's say we have some information stored in a vector. How do we find one 
specific chunk of it? One way to do so would be to start at the first 
location in the vector (index 0) and do a linear search for our target. 
Here's an example vector: 


(define testvector (vector 1 2 5 6 11 15 23 25 42 50
                           0 0 0 0  0  0  0  0  0  0))


And here's a picture of what it looks like...note that the vector extends 
beyond index 9 through index 19, but it's all zeros... 


+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
|     |     |     |     |     |     |     |     |     |     |
|  1  |  2  |  5  |  6  |  11 |  15 |  23 |  25 |  42 |  50 |.....
|     |     |     |     |     |     |     |     |     |     |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+

   0     1     2     3     4     5     6     7     8     9


Why look, here's some code for doing the linear search. The function called 
linear-search-of-vector is really just a wrapper for the function that does 
the real work, which is called find-index. The linear-search-of-vector 
function simply calls find-index. The find-index function then does the 
search for the desired item in the vector, starting with the first location. 
Find-index returns the index of the item if it finds what it's looking for, 
and returns a number that is one greater than the maximum allowable index for 
that vector if it doesn't find what it's looking for. That's what just sort 
of naturally fell out of writing this function while looking at the pseudocode
approach in the book used in the old pseudocode days of our introductory
computing course.  Then to have the linear-search-of-vector function return a 
more meaningful answer, I have it return #f instead of an index that's too 
big to indicate that the target item wasn't found.  That's really the only
reason for the additional wrapper.

(define (linear-search-of-vector invector item)
  (let ((index (find-index invector item)))
    (if (> index (- (vector-length invector) 1))
        #f 
        index)))

(define (find-index invector item)
  (do ((index 0) (maxindex (- (vector-length invector) 1)))
    ((or (> index maxindex)
         (= item (vector-ref invector index)))
    index)
    (set! index (+ index 1))))

Welcome to DrScheme, version 102.
Language: Textual Full Scheme (MzScheme).
> (linear-search-of-vector testvector 15)
5
> (linear-search-of-vector testvector 50)
9
> (linear-search-of-vector testvector 1)
0
> (linear-search-of-vector testvector 0)
10
> (linear-search-of-vector testvector 51)
#f


For those of you who are challenged by the organization of the functions 
above and would rather have the test to see whether the index where the 
search stops is a valid index inside the do structure, here's what that 
looks like: 


(define (linear-search-of-vector invector item)
  (do ((index 0) (maxindex (- (vector-length invector) 1)))
    ((or (> index maxindex)
         (= item (vector-ref invector index)))
     (if (> index maxindex)
         #f
         index))
    (set! index (+ index 1))))


But I've never been a fan of putting anything even remotely complicated 
like an if-structure in the action-expression part of the termination test 
in a do-structure (ick!  it just makes things look all cluttered up!), so 
I'd abstract that action-expression away as a separate function, and now 
the whole thing looks a lot like what we started with:

(define (linear-search-of-vector invector item)
  (do ((index 0) (maxindex (- (vector-length invector) 1)))
    ((or (> index maxindex)
         (= item (vector-ref invector index)))
     (return-appropriate-response index maxindex))
    (set! index (+ index 1))))

(define (return-appropriate-response index maxindex)
  (if (> index maxindex)
      #f
      index))

There are so many different ways to do things, aren't there?
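
Here's one more way, sketched with a named let instead of a do loop
and set!. The name linear-search-named-let is an assumption, and so is
the switch from = to equal?, which lets the same search work on
vectors holding things other than numbers.

```scheme
;; Hypothetical variant (linear-search-named-let is not from the lecture).
;; Uses equal? rather than =, so it also works on non-numeric items.
(define (linear-search-named-let invector item)
  (let loop ((index 0))
    (cond ((>= index (vector-length invector)) #f)   ; ran off the end
          ((equal? item (vector-ref invector index)) index)
          (else (loop (+ index 1))))))

(define testvector (vector 1 2 5 6 11 15 23 25 42 50
                           0 0 0 0  0  0  0  0  0  0))

(linear-search-named-let testvector 15)   ; => 5
(linear-search-named-let testvector 51)   ; => #f
```

Since the "not found" case is handled right in the cond, there's no
too-big index to translate into #f afterward.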


VI.  Binary search on a sorted vector 

If the items in my vector are sorted in some way, I'm not stuck with pokey 
old linear search. I can apply the same sort of binary search that we 
discussed in previous lectures. We'll need to keep track of the begin and end 
points of the part of the vector that remains to be searched. To start with, 
we'll store that information as a list in index 0 of our vector. The list 
contains two numbers, 1 and 9, indicating that even though this is a 
20-element vector (with the first element reserved for that list I just 
mentioned), the first element actually being used in this vector is at index 
1 and the last element used is at index 9...the remaining elements are just 
there if we decide we want to use them later for something, which we won't. 


(define testvector (vector '(1 9) 2 5 6 11 15 23 25 42 50
                               0  0 0 0  0  0  0  0  0  0))

The first function just retrieves the indices to the first and last elements 
of the vector and passes them to the next function, which does all the work. 

(define (binary-search-of-vector invector item)
  (binary-search-vec-help invector item
                          (car (vector-ref invector 0))
                          (cadr (vector-ref invector 0))))

The binary-search-vec-help function then computes the index of the middle 
element of the vector, retrieves the item that's there, and tests to see if 
it's the target. If so, we're done. If not, the function determines whether 
the target is less than the item at the midpoint or greater than that item. 
The indices to the start and end of that portion of the vector yet to be 
searched are updated accordingly, and binary-search-vec-help calls itself. 

(define (binary-search-vec-help invector item first last)
  (if (> first last)              ;; nothing left to search...without this
      #f                          ;; test, a missing item like 30 loops forever
      (let ((middle-element (get-middle-element invector 
                                                (compute-mid first last))))
        (cond ((equal? item middle-element) item)
              ((= first last) 
               (display middle-element)      ;; for tracing
               (newline)                     ;; for tracing
               #f)
              ((< item middle-element)
               (display middle-element)      ;; for tracing
               (newline)                     ;; for tracing
               (binary-search-vec-help invector 
                                       item 
                                       first 
                                       (- (compute-mid first last) 1)))
              (else 
               (display middle-element)      ;; for tracing
               (newline)                     ;; for tracing
               (binary-search-vec-help invector 
                                       item 
                                       (+ (compute-mid first last) 1)
                                       last))))))
   
(define (compute-mid first last)
  (ceiling (/ (+ first last) 2)))

(define (get-middle-element invector index)
  (vector-ref invector index))


Graphically, here's what should happen in searching for the target number 
23 in this sorted vector:


+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
|     |     |     |     |     |     |     |     |     |     |
|(1 9)|  2  |  5  |  6  |  11 |  15 |  23 |  25 |  42 |  50 |.....
|     |     |     |     |     |     |     |     |     |     |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+

   0     1     2     3     4     5     6     7     8     9

         ^                       ^                       ^
         |                       |                       |
       first                    mid                    last

+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
|     |     |     |     |     |     |     |     |     |     |
|(1 9)|  2  |  5  |  6  |  11 |  15 |  23 |  25 |  42 |  50 |.....
|     |     |     |     |     |     |     |     |     |     |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+

   0     1     2     3     4     5     6     7     8     9

                                       ^           ^     ^
                                       |           |     |
                                     first        mid  last

+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
|     |     |     |     |     |     |     |     |     |     |
|(1 9)|  2  |  5  |  6  |  11 |  15 |  23 |  25 |  42 |  50 |.....
|     |     |     |     |     |     |     |     |     |     |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+

   0     1     2     3     4     5     6     7     8     9

                                       ^     ^     
                                       |     |     
                                     first  mid
                                           last  


+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+
|     |     |     |     |     |     |     |     |     |     |
|(1 9)|  2  |  5  |  6  |  11 |  15 |  23 |  25 |  42 |  50 |.....
|     |     |     |     |     |     |     |     |     |     |
+-----+-----+-----+-----+-----+-----+-----+-----+-----+-----+

   0     1     2     3     4     5     6     7     8     9

                                       ^          
                                       |          
                                     first  
                                      mid
                                     last
                              (and the found item) 
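
If you want to check the arithmetic behind those four pictures, here's 
a small sketch you could type in. It repeats the compute-mid definition 
from above so that it runs on its own, and the first/last pairs are 
just the ones shown in the pictures. Notice that the ceiling makes a 
midpoint that falls between two indices round up to the higher one:

(define (compute-mid first last)       ;; same definition as above
  (ceiling (/ (+ first last) 2)))

(compute-mid 1 9)   ;; => 5   first picture:  element 15
(compute-mid 6 9)   ;; => 8   second picture: element 42
(compute-mid 6 7)   ;; => 7   third picture:  element 25
(compute-mid 6 6)   ;; => 6   fourth picture: element 23, found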


And here are those Scheme functions from above at work... 

                               
Welcome to DrScheme, version 102.
Language: Textual Full Scheme (MzScheme).

> testvector
#20((1 9) 2 5 6 11 15 23 25 42 50 0)

> (binary-search-of-vector testvector 23)
15
42
25
23

> (binary-search-of-vector testvector 15)
15

> (binary-search-of-vector testvector 11)
15
6
11

> (binary-search-of-vector testvector 2)
15
6
5
2

> (binary-search-of-vector testvector 50)
15
42
50

> (binary-search-of-vector testvector 1)
15
6
5
2
#f

> (binary-search-of-vector testvector 51)
15
42
50
#f

> (binary-search-of-vector testvector 24)
15
42
25
23
#f

> (binary-search-of-vector testvector 0)
15
6
5
2
#f
>
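
For comparison, here's a sketch of the same halving idea written as a 
do loop with no tracing output. The name search-range and its explicit 
first and last arguments are my own assumptions for illustration; they 
are not the definitions used above. The sample vector just copies the 
first ten elements from the pictures:

(define (search-range invector item first last)
  (do ((lo first)       ;; lowest index still in play
       (hi last)        ;; highest index still in play
       (mid 0)
       (result #f)      ;; stays #f unless we find the item
       (done #f))
    ((or done (> lo hi)) result)
    (set! mid (ceiling (/ (+ lo hi) 2)))
    (cond ((= item (vector-ref invector mid))
           (set! result (vector-ref invector mid))
           (set! done #t))
          ((< item (vector-ref invector mid))
           (set! hi (- mid 1)))        ;; keep the lower half
          (else
           (set! lo (+ mid 1))))))     ;; keep the upper half

(define sample (vector '(1 9) 2 5 6 11 15 23 25 42 50))

(search-range sample 23 1 9)   ;; => 23
(search-range sample 24 1 9)   ;; => #f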


VII.  Vectors of vectors

Can the elements of a vector themselves be vectors? The answer is 
yes, and you'd get a vector of vectors, or what in other languages 
would be called a two-dimensional array. And as you might expect, a 
vector could have elements that are vectors, each of which has 
elements that are vectors, giving a three-dimensional array. How many 
dimensions can you put on these things? As many as you want.
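
Here's a tiny sketch of the three-dimensional case. The name cube and 
the numbers in it are made up for illustration. Getting at a single 
cell takes one vector-ref per dimension, working from the outside in:

(define cube
  (vector (vector (vector 1 2) (vector 3 4))
          (vector (vector 5 6) (vector 7 8))))

;; outer index 1, then middle index 0, then inner index 1
(vector-ref (vector-ref (vector-ref cube 1) 0) 1)   ;; => 6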

Let's say we wanted to make a simple multiplication table. You 
remember the multiplication table, don't you? One times one is one, 
one times two is two, and on and on. For purposes of this small 
example, we'll create a vector of five elements, each of which is a 
vector of five elements. This will give us a 5x5 array or table of 
cells. Let's make it like this:


(define mult-table (make-vector 5))            ;; this works: the rows
                                               ;; are equal? but not eq?
(vector-set! mult-table 0 (vector 0 0 0 0 0))
(vector-set! mult-table 1 (vector 0 0 0 0 0))
(vector-set! mult-table 2 (vector 0 0 0 0 0))
(vector-set! mult-table 3 (vector 0 0 0 0 0))
(vector-set! mult-table 4 (vector 0 0 0 0 0))


Once we build this thing, we can take a look at it. DrScheme tells us 
that we have a five-element vector, and each element of that vector 
is a five-element vector:


Welcome to DrScheme, version 100alpha4.
Language: MzScheme.
>  mult-table
#5(#5(0) #5(0) #5(0) #5(0) #5(0))


We can think of this vector of vectors, or this array, as a 5x5 grid 
with a total of 25 individually-accessible cells, and we can think of 
each row of 5 elements as being a five-element vector in itself. (We 
could think of the columns of 5 cells as each being a five-element 
vector instead, but we've chosen to look at the rows that way.) Each 
call to vector built its own row and set every element of that row 
to 0:


    col  0    1    2    3    4
row
       ____ ____ ____ ____ ____
      |    |    |    |    |    |
  0   |  0 |  0 |  0 |  0 |  0 |
      |____|____|____|____|____|
      |    |    |    |    |    |
  1   |  0 |  0 |  0 |  0 |  0 |
      |____|____|____|____|____|
      |    |    |    |    |    |
  2   |  0 |  0 |  0 |  0 |  0 |
      |____|____|____|____|____|
      |    |    |    |    |    |
  3   |  0 |  0 |  0 |  0 |  0 |
      |____|____|____|____|____|
      |    |    |    |    |    |
  4   |  0 |  0 |  0 |  0 |  0 |
      |____|____|____|____|____|


If we want to write a program to go through and initialize the values 
of our multiplication table, it's not difficult:

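(define (initialize table rows columns)
   ;; rows and columns are the highest row and column indices
   ;; (4 and 4 for our 5x5 table), not the number of rows and columns

   ;; extract each row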
   (do ((currentrow '()) (rowcounter 0))
     ((> rowcounter rows) #t)
     (set! currentrow (vector-ref table rowcounter))

     ;; for current row, put mult values in each column
     (do ((columncounter 0))
       ((> columncounter columns) #t)
       (vector-set! currentrow columncounter (* columncounter
                                                rowcounter))
       (set! columncounter (+ columncounter 1)))

     ;; move to the next row
     (set! rowcounter (+ rowcounter 1))))


Welcome to DrScheme, version 100alpha4.
Language: MzScheme.
>  mult-table
#5(#5(0) #5(0) #5(0) #5(0) #5(0))
>  (initialize mult-table 4 4)
#t
>  mult-table
#5(#5(0) #5(0 1 2 3 4) #5(0 2 4 6 8) #5(0 3 6 9 12) #5(0 4 8 12 16))
>

And here's the logical representation of what that table looks like 
when we're done initializing:

    col  0    1    2    3    4
row
       ____ ____ ____ ____ ____
      |    |    |    |    |    |
  0   |  0 |  0 |  0 |  0 |  0 |
      |____|____|____|____|____|
      |    |    |    |    |    |
  1   |  0 |  1 |  2 |  3 |  4 |
      |____|____|____|____|____|
      |    |    |    |    |    |
  2   |  0 |  2 |  4 |  6 |  8 |
      |____|____|____|____|____|
      |    |    |    |    |    |
  3   |  0 |  3 |  6 |  9 | 12 |
      |____|____|____|____|____|
      |    |    |    |    |    |
  4   |  0 |  4 |  8 | 12 | 16 |
      |____|____|____|____|____|



And if we want to abstract away the initialization of a row, so we 
don't have to look at unsightly nested loops, we can move the inner 
loop into its own procedure:


(define (initialize-once-more table rows columns)
   ;; extract each row
   (do ((currentrow '()) (rowcounter 0))
     ((> rowcounter rows) #t)
     (set! currentrow (vector-ref table rowcounter))
     ;; for current row, put mult values in each column
     (initialize-row currentrow columns rowcounter)
     ;; move to next row
     (set! rowcounter (+ rowcounter 1))))

(define (initialize-row currentrow columns rowcounter)
   (do ((columncounter 0))
     ((> columncounter columns) #t)
     (vector-set! currentrow columncounter (* columncounter
                                              rowcounter))
     (set! columncounter (+ columncounter 1))))

Welcome to DrScheme, version 100alpha4.
Language: MzScheme.
>  mult-table
#5(#5(0) #5(0) #5(0) #5(0) #5(0))
>  (initialize-once-more mult-table 4 4)
#t
>  mult-table
#5(#5(0) #5(0 1 2 3 4) #5(0 2 4 6 8) #5(0 3 6 9 12) #5(0 4 8 12 16))
>


Finally, if we want to actually multiply two numbers using our 
multiplication table, we just pass the two numbers (that will be used 
as indices) and the table itself to this procedure:


(define (multiply a b table)
   (vector-ref (vector-ref table a) b))


Welcome to DrScheme, version 100alpha4.
Language: MzScheme.
>  (initialize-once-more mult-table 4 4)
#t
>  mult-table
#5(#5(0) #5(0 1 2 3 4) #5(0 2 4 6 8) #5(0 3 6 9 12) #5(0 4 8 12 16))
>  (multiply 3 4 mult-table)
12
>  (multiply 4 3 mult-table)
12
>  (multiply 2 4 mult-table)
8
>  (multiply 4 2 mult-table)
8
>
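
Of course multiply only works for numbers that are legal indices into 
the table; ask for 5 times 5 and vector-ref will complain. Here's a 
sketch of a more cautious version that returns #f instead. The name 
checked-multiply and the little 3x3 table are my own inventions, and 
the sketch assumes a and b are exact integers:

(define (checked-multiply a b table)
  (if (and (< -1 a (vector-length table))
           (< -1 b (vector-length (vector-ref table a))))
      (vector-ref (vector-ref table a) b)
      #f))                             ;; index outside the table

(define tiny-table
  (vector (vector 0 0 0) (vector 0 1 2) (vector 0 2 4)))

(checked-multiply 2 2 tiny-table)   ;; => 4
(checked-multiply 3 1 tiny-table)   ;; => #f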


If you decide to use Scheme to build multi-dimensional arrays, you 
should know that things are not always what they seem. Let's say we 
build our initial multiplication table like this instead:


(define table2 (make-vector 5 (make-vector 5 0))) ;; this doesn't work:
                                                  ;; the elements are eq?

>  table2
#5(#5(0))
>

    col  0    1    2    3    4
row
       ____ ____ ____ ____ ____
      |    |    |    |    |    |
  0   |  0 |  0 |  0 |  0 |  0 |
      |____|____|____|____|____|
      |    |    |    |    |    |
  1   |  0 |  0 |  0 |  0 |  0 |
      |____|____|____|____|____|
      |    |    |    |    |    |
  2   |  0 |  0 |  0 |  0 |  0 |
      |____|____|____|____|____|
      |    |    |    |    |    |
  3   |  0 |  0 |  0 |  0 |  0 |
      |____|____|____|____|____|
      |    |    |    |    |    |
  4   |  0 |  0 |  0 |  0 |  0 |
      |____|____|____|____|____|



When we initialize the table with the multiplication values that we 
want, we get something unexpected.

>  (initialize table2 4 4)
#t
>  table2
#5(#5(0 4 8 12 16))
>


    col  0    1    2    3    4
row
       ____ ____ ____ ____ ____
      |    |    |    |    |    |
  0   |  0 |  4 |  8 | 12 | 16 |
      |____|____|____|____|____|
      |    |    |    |    |    |
  1   |  0 |  4 |  8 | 12 | 16 |
      |____|____|____|____|____|
      |    |    |    |    |    |
  2   |  0 |  4 |  8 | 12 | 16 |
      |____|____|____|____|____|
      |    |    |    |    |    |
  3   |  0 |  4 |  8 | 12 | 16 |
      |____|____|____|____|____|
      |    |    |    |    |    |
  4   |  0 |  4 |  8 | 12 | 16 |
      |____|____|____|____|____|



That's not at all what we were hoping for, is it? What happened here 
is that when we asked DrScheme to do this:

(define table2 (make-vector 5 (make-vector 5 0))) ;; this doesn't work:
                                                  ;; the elements are eq?


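DrScheme made the five elements of the larger five-element vector 
eq?, not merely equal?. In other words, the five-element vector that 
we hoped would hold a different five-element vector in each cell 
instead holds five pointers to the same five-element vector 
(initially filled with zeros). Personally, I find this a bit quirky, 
but there is a reason: Scheme evaluates the arguments to make-vector 
before calling it, so the inner (make-vector 5 0) runs exactly once, 
and the outer make-vector stores that one result in every slot. Every 
write through any "row" lands in the same vector, and the last row 
written (row 4) wins. Apparently it's also why table2 looks like this 
when DrScheme prints it:


>  table2
#5(#5(0))
>

instead of like this:

#5(#5(0) #5(0) #5(0) #5(0) #5(0))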
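
Here's a sketch that shows the sharing directly, along with one way 
around it. The names shared, separate, and make-table are 
hypothetical. Unlike initialize above, make-table takes the number of 
rows and columns rather than the highest indices:

(define shared (make-vector 3 (make-vector 3 0)))

(eq? (vector-ref shared 0) (vector-ref shared 1))   ;; => #t
(vector-set! (vector-ref shared 0) 0 99)            ;; change "row 0"
(vector-ref (vector-ref shared 2) 0)                ;; => 99, "row 2" too

;; build the outer vector first, then give each slot its own row
(define (make-table rows columns)
  (do ((table (make-vector rows))
       (rowcounter 0))
    ((= rowcounter rows) table)
    (vector-set! table rowcounter (make-vector columns 0))
    (set! rowcounter (+ rowcounter 1))))

(define separate (make-table 3 3))

(eq? (vector-ref separate 0) (vector-ref separate 1))      ;; => #f
(equal? (vector-ref separate 0) (vector-ref separate 1))   ;; => #t

Because make-table calls (make-vector columns 0) once per trip through 
the loop, each row is a brand new vector, and changing one row leaves 
the others alone.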


Just more food for thought.



Copyright (c) 2003 by Kurt Eiselt.  All rights reserved, with 
the exception of stuff that belongs to somebody else.

Last revised: November 6, 2003