CS 1321X - Lecture 29 - December 2, 2003

Object-Oriented Python (and some sorting too)


I.  Public versus private

(This lecture was given by Mitch; here are my notes for the
same material.)

In the previous lecture, we suggested that there may be a potential
problem with our Python simulation of my Coke Machine empire.
In particular, that numcans variable is pretty much accessible from 
everywhere. So any old method from any object can not only read what's 
in there, but write over it too.  This could lead to problems, 
especially in large programming projects.

In Python, by default, all attributes of an object are "public"...all 
attributes of a class instance are accessible without any restrictions.  As 
Beazley's "Python: Essential Reference" goes on to say:

  It also implies that everything defined in a base class is inherited and 
  accessible within a derived class.  This behavior is often undesirable in 
  object-oriented applications because it exposes the internal implementation 
  of an object and it can lead to namespace conflicts between objects defined 
  in a derived class and those defined in a base class.

  To fix this problem, all names in a class that start with a double 
  underscore, such as __Foo, are mangled to form a new name of the form 
  _Classname__Foo.  This effectively provides a way for a class to have 
  private attributes, since private names used in a derived class won't 
  collide with the same private names used in a base class.  For example:

  class A:
     def __init__(self):
        self.__X = 3       # Mangled to self._A__X

  class B(A):
     def __init__(self):
        A.__init__(self)
        self.__X = 37      # Mangled to self._B__X

  Although this scheme provides the illusion of data hiding, there's no strict 
  mechanism in place to prevent access to the "private" attributes of a class.
  In particular, if the name of the class and corresponding private attribute 
  are known, they can be accessed using the mangled name.

So in short, while Java for example enforces a distinction between public and 
private attributes, in Python everything is public.  With Python's help you 
can make stuff look private, but ultimately it's public anyway.  For our 
example, that's not so important, but in your future you may encounter 
situations where not having that real public/private distinction may lead you 
to choose some other OO language over Python.  That's ok...it's why we took 
the time to talk about this.
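
If you want to see the mangling for yourself, here's a minimal sketch.
The class name Vault and the attribute __secret are made up for this
illustration (they're not part of the Coke machine code), and each print
wraps a single string in parentheses so the snippet should behave the
same under Python 2 and Python 3.

```python
class Vault:
    def __init__(self):
        self.__secret = 42           # stored on the instance as _Vault__secret

    def peek(self):
        return self.__secret         # inside the class, the short name works


v = Vault()
print("via method:       %d" % v.peek())

# From outside the class, the short name is not found...
try:
    v.__secret
    found = True
except AttributeError:
    found = False
print("v.__secret found? %s" % found)

# ...but the mangled name is reachable, so nothing is truly private.
print("via mangled name: %d" % v._Vault__secret)
v._Vault__secret = 0                 # and it can be overwritten too
print("after overwrite:  %d" % v.peek())
```

So the double underscore keeps you from tripping over a name by accident,
but it won't stop anybody who goes looking for it.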

Let's get back to vending machines.  How do we simulate the purchase of a 
Coke? We must create a BuyCoke method. It'll be easy. It'll look like this, 
after we add it to the CokeMachine class:

class CokeMachine:
   def __init__(self):
      self.numcans = 20
      print "Adding another Coke machine to your empire"
   def BuyCoke(self):
      if self.numcans > 0:
         self.numcans = self.numcans - 1
         print "Have a nice frosty Coke!"
         print "%3d frosty cans of Coke remaining." % (self.numcans)
      else:
         print "Sorry, out of Coke."

Here are some helpful additions to SimCoke:

class SimCoke:
   def __init__(self):
      print " "
      print "Here's the Coke Machine Simulator"
      cs = CokeMachine()
      ee = CokeMachine()
      print "The cs machine has %3d frosty cans of Coke waiting for you." \
            % (cs.numcans)
      print "The ee machine has %3d frosty cans of Coke waiting for you." \
            % (ee.numcans)
      cs.BuyCoke()
      cs.BuyCoke()
      print "The cs machine has %3d frosty cans of Coke waiting for you." \
            % (cs.numcans)
      print "The ee machine has %3d frosty cans of Coke waiting for you." \
            % (ee.numcans)


And here's the output when we run it:


Here's the Coke Machine Simulator
Adding another Coke machine to your empire
Adding another Coke machine to your empire
The cs machine has  20 frosty cans of Coke waiting for you.
The ee machine has  20 frosty cans of Coke waiting for you.
Have a nice frosty Coke!
 19 frosty cans of Coke remaining.
Have a nice frosty Coke!
 18 frosty cans of Coke remaining.
The cs machine has  18 frosty cans of Coke waiting for you.
The ee machine has  20 frosty cans of Coke waiting for you.


Our Scheme-based Coke machine simulator let us load Cokes into the machine 
too. We can do the same thing here with the addition of another method to the 
CokeMachine class.  We'll call it LoadCoke:

def LoadCoke(self, loadcans):
   self.numcans = self.numcans + loadcans
   print "%3d cans of Coke added" % (loadcans)
   print "%3d cans of Coke now available" % (self.numcans)


That should be pretty straightforward...nothing weird here, except we're 
going to explicitly pass a parameter to LoadCoke.  Watch for it in the 
ever-growing SimCoke below.  Here's the whole thing:

class CokeMachine:
   def __init__(self):
      self.numcans = 20
      print "Adding another Coke machine to your empire"
   def BuyCoke(self):
      if self.numcans > 0:
         self.numcans = self.numcans - 1
         print "Have a nice frosty Coke!"
         print "%3d frosty cans of Coke remaining." % (self.numcans)
      else:
         print "Sorry, out of Coke."
   def LoadCoke(self, loadcans):
      self.numcans = self.numcans + loadcans
      print "%3d cans of Coke added" % (loadcans)
      print "%3d cans of Coke now available" % (self.numcans)


class SimCoke:
   def __init__(self):
      print " "
      print "Here's the Coke Machine Simulator"
      cs = CokeMachine()
      ee = CokeMachine()
      print "The cs machine has %3d frosty cans of Coke waiting for you." \
            % (cs.numcans)
      print "The ee machine has %3d frosty cans of Coke waiting for you." \
            % (ee.numcans)
      cs.BuyCoke()
      cs.BuyCoke()
      print "The cs machine has %3d frosty cans of Coke waiting for you." \
            % (cs.numcans)
      print "The ee machine has %3d frosty cans of Coke waiting for you." \
            % (ee.numcans)
      cs.LoadCoke(10)
      print "The cs machine has %3d frosty cans of Coke waiting for you." \
            % (cs.numcans)
      ee.LoadCoke(-20)
      print "The ee machine has %3d frosty cans of Coke waiting for you." \
             % (ee.numcans)
      ee.BuyCoke()

SimCoke()


And here's some output:

Here's the Coke Machine Simulator
Adding another Coke machine to your empire
Adding another Coke machine to your empire
The cs machine has  20 frosty cans of Coke waiting for you.
The ee machine has  20 frosty cans of Coke waiting for you.
Have a nice frosty Coke!
 19 frosty cans of Coke remaining.
Have a nice frosty Coke!
 18 frosty cans of Coke remaining.
The cs machine has  18 frosty cans of Coke waiting for you.
The ee machine has  20 frosty cans of Coke waiting for you.
 10 cans of Coke added
 28 cans of Coke now available
The cs machine has  28 frosty cans of Coke waiting for you.
-20 cans of Coke added
  0 cans of Coke now available
The ee machine has   0 frosty cans of Coke waiting for you.
Sorry, out of Coke.


II.  More Coke Machine stuff

As I grow my collection of Coke machines, I'd like to be able to find out 
easily just how many Coke machines I own. So what I need to do is to have the 
Coke machine constructor update some sort of counter for the whole class of 
Coke machines every time a new Coke machine is created. To do that, we'll 
introduce a variable that belongs to the class, not to any particular 
instance or object.  This is roughly the equivalent of a static variable in 
Java:  we're saying that the variable is always around so long as the class 
is around, regardless of whether instances (or objects) of the class have 
been created.

So we add the totalmachines variable up at the top of the definition of the 
CokeMachine class so that we don't confuse it with the methods associated 
with the objects that will be created, and then we make sure to increment 
this variable inside the __init__ method that will be executed every time a 
new Coke machine is created (i.e., every time we generate a new instance of 
the CokeMachine class):

class CokeMachine:
   totalmachines = 0
   def __init__(self):
      CokeMachine.totalmachines = CokeMachine.totalmachines + 1
      self.numcans = 20
      print "Adding another Coke machine to your empire"
   def BuyCoke(self):
      if self.numcans > 0:
         self.numcans = self.numcans - 1
         print "Have a nice frosty Coke!"
         print "%3d frosty cans of Coke remaining." % (self.numcans)
      else:
         print "Sorry, out of Coke."
   def LoadCoke(self, loadcans):
      self.numcans = self.numcans + loadcans
      print "%3d cans of Coke added" % (loadcans)
      print "%3d cans of Coke now available" % (self.numcans)


class SimCoke:
   def __init__(self):
      print " "
      print "Here's the Coke Machine Simulator"
      cs = CokeMachine()
      ee = CokeMachine()
      print "The cs machine has %3d frosty cans of Coke waiting for you." \
            % (cs.numcans)
      print "The ee machine has %3d frosty cans of Coke waiting for you." \
            % (ee.numcans)
      cs.BuyCoke()
      cs.BuyCoke()
      print "The cs machine has %3d frosty cans of Coke waiting for you." \
            % (cs.numcans)
      print "The ee machine has %3d frosty cans of Coke waiting for you." \
            % (ee.numcans)
      cs.LoadCoke(10)
      print "The cs machine has %3d frosty cans of Coke waiting for you." \
            % (cs.numcans)
      ee.LoadCoke(-20)
      print "The ee machine has %3d frosty cans of Coke waiting for you." \
            % (ee.numcans)
      ee.BuyCoke()
      print "There are %2d Coke Machines in your empire" \
            % (CokeMachine.totalmachines)

SimCoke()


Lo and behold, here's some output!

Here's the Coke Machine Simulator
Adding another Coke machine to your empire
Adding another Coke machine to your empire
The cs machine has  20 frosty cans of Coke waiting for you.
The ee machine has  20 frosty cans of Coke waiting for you.
Have a nice frosty Coke!
 19 frosty cans of Coke remaining.
Have a nice frosty Coke!
 18 frosty cans of Coke remaining.
The cs machine has  18 frosty cans of Coke waiting for you.
The ee machine has  20 frosty cans of Coke waiting for you.
 10 cans of Coke added
 28 cans of Coke now available
The cs machine has  28 frosty cans of Coke waiting for you.
-20 cans of Coke added
  0 cans of Coke now available
The ee machine has   0 frosty cans of Coke waiting for you.
Sorry, out of Coke.
There are  2 Coke Machines in your empire
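
To convince yourself that a variable like totalmachines really belongs to
the class and not to any one machine, here's a small sketch.  Machine,
total, and cans are hypothetical stand-ins for CokeMachine, totalmachines,
and numcans, trimmed down so that only the counting is left.

```python
class Machine:
    total = 0                        # belongs to the class itself

    def __init__(self):
        Machine.total = Machine.total + 1
        self.cans = 20               # belongs to this one instance


print("before any instances: %d" % Machine.total)
a = Machine()
b = Machine()
print("after two instances:  %d" % Machine.total)

# Reading through an instance finds the class's value...
print("seen through a:       %d" % a.total)

# ...but assigning through an instance creates a new instance attribute
# that hides the class variable for that one object only.
a.total = 99
print("a.total is now %d, Machine.total is still %d" % (a.total, Machine.total))
print("b.total still reads the class value: %d" % b.total)
```

That last part is why the constructor above writes CokeMachine.totalmachines
rather than self.totalmachines:  assigning through self would give each
machine its own private counter instead of updating the shared one.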


III. Inheritance

Let's just say, for the sake of argument, that I'm introducing a new line of 
Coke machines that work like my older Coke machines, which I still want to 
keep because they're making money, but they have some additional features 
that the old Coke machines don't have. For example, maybe I want to keep 
track of how many cans of Coke a particular machine has sold, so I can ping 
it once in a while to find out how much profit I've made on that machine.

I could create a new class of machine, called "CokeMachine2002", by copying 
all the code from the CokeMachine class and just adding my additional code. 
Or I could just create the CokeMachine2002 class by telling Python that 
the CokeMachine2002 class takes on all the attributes and methods of the 
original CokeMachine class. When a class is derived from another class, that 
new class *inherits* the data structures and methods of the original class. 
You don't have to copy code; Python does the work for you. The original 
class is called the parent class, the base class, or the superclass. The new, 
extended class is called the child class or subclass or, in Python, the 
derived class. In general, the subclass has everything that the superclass 
has, plus additional stuff to make it more specialized than the superclass.

  "Inheritance means being able to declare a type which
   builds on the fields (data and methods) of a previously
   declared type. As well as inheriting all the operations
   and data, you get the chance to declare your own
   versions and new versions of the methods to refine,
   specialize, replace, or extend the ones in the parent
   class."
   (Peter van der Linden, "Just Java 2", p. 150)

Here's a simple example of how to extend a class. Let's start with just the 
basics. First, I tell Python that CokeMachine2002 inherits from CokeMachine 
by adding a parameter list in the class statement that lists all the classes 
that CokeMachine2002 will inherit from:

class CokeMachine2002(CokeMachine):
   pass

And that's all I need, at minimum, to extend the CokeMachine class. The
CokeMachine2002 class is now a clone of the CokeMachine class and can
be used in exactly the same way:

class SimCoke:
   def __init__(self):
      print " "
      print "Here's the Coke Machine Simulator"
      cs = CokeMachine()
      ee = CokeMachine()
      ie = CokeMachine2002()
      print "The ie machine has %3d frosty cans of Coke waiting for you." \
            % (ie.numcans)
      ie.BuyCoke()
      print "The ie machine has %3d frosty cans of Coke waiting for you." \
            % (ie.numcans)
      ie.LoadCoke(10)
      print "The ie machine has %3d frosty cans of Coke waiting for you." \
            % (ie.numcans)
      print "There are %2d Coke Machines in your empire" \
            % (CokeMachine.totalmachines)

And here's the output, just so you can see for yourself:

Here's the Coke Machine Simulator
Adding another Coke machine to your empire
Adding another Coke machine to your empire
Adding another Coke machine to your empire
The ie machine has  20 frosty cans of Coke waiting for you.
Have a nice frosty Coke!
 19 frosty cans of Coke remaining.
The ie machine has  19 frosty cans of Coke waiting for you.
 10 cans of Coke added
 29 cans of Coke now available
The ie machine has  29 frosty cans of Coke waiting for you.
There are  3 Coke Machines in your empire

Now where was I? Oh yes, I wanted to give CokeMachine2002 the ability to keep 
track of how many cans it has sold and report on the profit. CokeMachine2002 
should have already inherited a variable that keeps track of how many cans 
are in the machine...the one called "numcans". Still, I need to add a 
variable to keep track of how many cans have been sold. Then I can multiply 
that number by the profit per can to get my profit from the machine at that 
time:

class CokeMachine2002(CokeMachine):
   def __init__(self):
      self.soldcans = 0
   def GetProfit(self, sellprice, cost):
      print "This machine has made %0.2f profit" \
            % (self.soldcans * (sellprice - cost))

There's a problem here though.  When I add an __init__ method here, it 
overrides the one that was inherited, so numcans never gets created and 
things blow up.  I need to make sure that the __init__ method from the base 
class gets done too:


class CokeMachine2002(CokeMachine):
   def __init__(self):
      CokeMachine.__init__(self)
      self.soldcans = 0
   def GetProfit(self, sellprice, cost):
      print "This machine has made %0.2f profit" \
            % (self.soldcans * (sellprice - cost))


That's gonna work just fine when I run it, except that I end up with no 
profit because I'm not updating the number of cans sold when I execute 
BuyCoke.  So I have to redefine BuyCoke and make sure that I execute the 
BuyCoke method from the base class too:

class CokeMachine2002(CokeMachine):
   def __init__(self):
      CokeMachine.__init__(self)
      self.soldcans = 0
   def GetProfit(self, sellprice, cost):
      print "This machine has made %0.2f profit" \
            % (self.soldcans * (sellprice - cost))
   def BuyCoke(self):
      CokeMachine.BuyCoke(self)
      self.soldcans = self.soldcans + 1
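
The failure described earlier, where a subclass __init__ replaces the
inherited one and numcans never gets created, can be reproduced with a
stripped-down trio of classes.  Base, Forgetful, and Careful are
hypothetical names used only for this sketch; they stand in for
CokeMachine and the two attempts at CokeMachine2002.

```python
class Base:
    def __init__(self):
        self.numcans = 20


class Forgetful(Base):
    def __init__(self):              # replaces Base.__init__ entirely
        self.soldcans = 0


class Careful(Base):
    def __init__(self):
        Base.__init__(self)          # run the base-class setup first
        self.soldcans = 0


print("Careful has numcans?   %s" % hasattr(Careful(), "numcans"))
print("Forgetful has numcans? %s" % hasattr(Forgetful(), "numcans"))

try:
    Forgetful().numcans
    outcome = "no error"
except AttributeError:
    outcome = "AttributeError"
print("Reading numcans on a Forgetful machine: %s" % outcome)
```

The same rule applies to any method you override, not just __init__:  if
you still want the base-class behavior, you have to call it yourself, the
way BuyCoke does above.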

Here's the whole thing, followed by some output:

class CokeMachine:
   totalmachines = 0
   def __init__(self):
      CokeMachine.totalmachines = CokeMachine.totalmachines + 1
      self.numcans = 20
      print "Adding another Coke machine to your empire"
   def BuyCoke(self):
      if self.numcans > 0:
         self.numcans = self.numcans - 1
         print "Have a nice frosty Coke!"
         print "%3d frosty cans of Coke remaining." % (self.numcans)
      else:
         print "Sorry, out of Coke."
   def LoadCoke(self, loadcans):
      self.numcans = self.numcans + loadcans
      print "%3d cans of Coke added" % (loadcans)
      print "%3d cans of Coke now available" % (self.numcans)


class CokeMachine2002(CokeMachine):
   def __init__(self):
      CokeMachine.__init__(self)
      self.soldcans = 0
   def GetProfit(self, sellprice, cost):
      print "This machine has made %0.2f profit" \
            % (self.soldcans * (sellprice - cost))
   def BuyCoke(self):
      CokeMachine.BuyCoke(self)
      self.soldcans = self.soldcans + 1


class SimCoke:
   def __init__(self):
      print " "
      print "Here's the Coke Machine Simulator"
      cs = CokeMachine()
      ee = CokeMachine()
      ie = CokeMachine2002()
      print "The ie machine has %3d frosty cans of Coke waiting for you." \
            % (ie.numcans)
      ie.BuyCoke()
      print "The ie machine has %3d frosty cans of Coke waiting for you." \
            % (ie.numcans)
      ie.LoadCoke(10)
      print "The ie machine has %3d frosty cans of Coke waiting for you." \
            % (ie.numcans)
      print "There are %2d Coke Machines in your empire" \
            % (CokeMachine.totalmachines)
      ie.GetProfit(0.75, 0.25)


SimCoke()


Here's the Coke Machine Simulator
Adding another Coke machine to your empire
Adding another Coke machine to your empire
Adding another Coke machine to your empire
The ie machine has  20 frosty cans of Coke waiting for you.
Have a nice frosty Coke!
 19 frosty cans of Coke remaining.
The ie machine has  19 frosty cans of Coke waiting for you.
 10 cans of Coke added
 29 cans of Coke now available
The ie machine has  29 frosty cans of Coke waiting for you.
There are  3 Coke Machines in your empire
This machine has made 0.50 profit

And now that I've finally made a profit, I'm going to simulate my
earning a fortune in quarters.  I'm almost a zillionaire...I can feel it.


IV.  Simple sorting

(The following stuff is close to what Mitch presented, except that
he started with bubble sort instead of insertion sort, and he
presented his stuff in Python instead of Scheme.  The time 
complexity of bubble sort is the same as insertion sort.
If I get a moment, I'll try to add the Python stuff, and 
bubble sort, in here. -- Kurt, Dec. 8)

Let's shift gears a little.  No, make that a lot.  One of the 
classic problems in computer science is to sort stuff...to 
take an unsorted sequence of data items and put them into some sorted 
order. Why sort stuff? Well, basically, it's to make searching for 
data items a whole lot easier, and searching for data is something 
that computers spend lots of time on. And as I'm sure you know from 
personal experience, it's easier to find something in an orderly 
environment than in a disorderly environment.  (Check my office 
sometime for an example of a disorderly environment.) Consider the 
phone book. It's relatively easy to find somebody's phone number, 
because the listings are sorted alphabetically on the (last) name of 
the person you're trying to find. But imagine if the phone book 
wasn't sorted...what if the names were randomly ordered? It would 
take a whole lot longer to find that special someone, wouldn't it? So 
that's why sorting is such a big deal. Additionally, there are things 
that happen in computer science that aren't exactly sorting, but they 
behave like sorting algorithms, so studying sorting has value that 
extends beyond sorting problems. But that's grist for another course 
later in your CS education, if you choose to go that far.

To make things a little more concrete, consider the problem of 
sorting the following list of eight numbers:

      7   3   1   4   8   6   2   5

Let's make it even more familiar:

     (7   3   1   4   8   6   2   5)

How could we sort this list? Well, the most intuitively obvious way 
to sort a list is to apply something that computer scientists like to 
call "insertion sort". We start with an unsorted list, and another 
empty list that will end up holding the sorted items. We take the 
first item of the unsorted list and insert it in the "right" place in 
the sorted list.


     unsorted list                  sorted list

(7 3 1 4 8 6 2 5)                    ()

Since the sorted list is empty on the first pass, this is pretty 
easy...we just put the item on the list.


(3 1 4 8 6 2 5)                      (7)


Now we take the next item off the unsorted list, and we successively 
compare it to each item on the sorted list, looking for the right 
place to insert it.


(1 4 8 6 2 5)                        (3 7)


We keep doing this until we run out of items on the unsorted list.


(4 8 6 2 5)                          (1 3 7)

(8 6 2 5)                            (1 3 4 7)

(6 2 5)                              (1 3 4 7 8)

(2 5)                                (1 3 4 6 7 8)

(5)                                  (1 2 3 4 6 7 8)

()                                   (1 2 3 4 5 6 7 8)


When that happens, the other list contains all the original elements, 
but now they're sorted!

Here's a Scheme implementation of insertion sort:


; insertion sort
;
; given an unsorted list and an empty list that will eventually
; hold the sorted items, repeat the following until the unsorted list
; is empty:
; 1) remove the first element from the unsorted list
; 2) traverse the sorted list from left to right one item at
;    a time, comparing them to the item removed from the
;    unsorted list
; 3) when you find an item in the sorted list that is greater than
;    or equal to the item from the unsorted list, insert the item
;    from the unsorted list just before the item you stopped at
;    in the sorted list (or at the end if you never stop)
;
; insertionsort expects a list of numbers passed via the parameter
; sortlist
;
; insertionsort returns a list of the numbers passed through sortlist
; after being sorted from lowest value to highest value, left to right

(define (insertionsort sortlist)
    (insertionsort-helper sortlist '()))

(define (insertionsort-helper sortlist result)
    (cond ((null? sortlist) result)
          (else (insertionsort-helper (cdr sortlist)
                                      (insert (car sortlist) result)))))

(define (insert item alreadysorted)
    (cond ((null? alreadysorted) (cons item '()))
          ((<= item (car alreadysorted))
           (cons item alreadysorted))
          (else (cons (car alreadysorted)
                      (insert item (cdr alreadysorted))))))


Just out of curiosity, what do you suppose is the time complexity for 
insertion sort? Well, assuming we want worst-case complexity (which 
is often what people are asking for when they ask about complexity), 
we need to ask what's the cost of inserting an item into a sorted 
list. And while we're at it, we need to decide what the unit of cost 
should be. Last week, we counted conses. We could do that again, or 
we could count a single comparison (i.e., is this thing less than, 
equal to, or greater than that thing?) as a unit of cost. I'll leave 
it up to you to convince yourself that in this problem, counting 
conses and counting comparisons will work out to the same thing, more 
or less...at least, they'll be in the same order of complexity, which 
is all we really care about for now.

So how many comparisons happen when inserting the first unsorted 
element into the sorted list? Since the sorted list is empty, 
there are zero comparisons. How about for the next unsorted element? 
In the worst case, there is one comparison. How about for the next 
unsorted element? In the worst case, there are now two comparisons, 
because the sorted list has two elements. This continues until we run 
out of unsorted items. That happens at the time when there are n-1 
sorted items on the sorted list, so the number of comparisons on this 
pass, in the worst case, would be n-1. So the worst-case total number 
of comparisons would be 0 + 1 + 2 + ... + n-2 + n-1, which is 
n*(n-1)/2. You've seen a pattern like this before, haven't you? So 
the time complexity, in terms of comparisons performed in a 
worst-case insertion sort, is O((n^2-n)/2), which reduces to O(n^2-n) 
because we don't care about constants like 1/2. And O(n^2-n) reduces 
to O(n^2) because n^2 grows much faster than n as n gets really big, 
so we can ignore the n term. Our answer, therefore, is O(n^2).

Unfortunately, O(n^2) is kind of undesirable. Remember, this big-O 
stuff doesn't give us an exact count of the number of comparisons or 
conses or whatever for a given procedure on a given input. What it 
gives us is a useful approximation of how fast the amount of work 
grows in proportion to increases in the size of the input. So what we 
can deduce from O(n^2) is that as the size of the problem grows, or 
in this case as the size of the list to be sorted grows, the amount 
of work performed grows in direct proportion to the square of the 
size of the problem. And don't forget that average-case behavior or 
best-case behavior may be very different from worst-case behavior. In 
the case of the insertion sort above, best-case time complexity (which 
occurs when the unsorted list is the reverse of the desired sorted 
list, because each new item then belongs at the front of the sorted 
list...prove it to yourself at home) is O(n) (prove that to yourself 
too!).
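
If you'd rather count than prove, here's a rough sketch that tallies
comparisons.  It's a Python stand-in for the Scheme insert procedure
above (same left-to-right scan, same <= test), not a translation the
course handed out, and the names insert and count_comparisons and the
one-element counter list are my own assumptions for the sake of the
example.

```python
def insert(item, alreadysorted, counter):
    """Return a new sorted list with item inserted; counter[0] tallies <= tests."""
    for i in range(len(alreadysorted)):
        counter[0] += 1
        if item <= alreadysorted[i]:
            return alreadysorted[:i] + [item] + alreadysorted[i:]
    return alreadysorted + [item]


def count_comparisons(values):
    """Insertion-sort values and return how many comparisons it took."""
    counter = [0]
    result = []
    for item in values:
        result = insert(item, result, counter)
    assert result == sorted(values)
    return counter[0]


for n in (4, 8, 16, 32):
    ascending = list(range(1, n + 1))     # worst case for this insert
    descending = ascending[::-1]          # best case for this insert
    print("n=%2d  worst=%3d  n(n-1)/2=%3d  best=%2d"
          % (n, count_comparisons(ascending), n * (n - 1) // 2,
             count_comparisons(descending)))
```

For n = 8 the worst case comes out to 28 comparisons, which is 8*7/2, and
the best case comes out to 7, which is n-1.  Doubling n roughly quadruples
the worst case but only doubles the best case.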

Of course, you don't have to do insertion sort using lists and 
recursion.  You could use vectors and iteration.  Here's one 
sample incarnation:

;; this procedure creates an output vector with same size
;; as input vector (to be sorted) and fills the output 
;; vector with non-numeric junk
;; then the procedure iterates over the items in the 
;; input vector and calls insert

(define (insertionsort sortvector)
  (do ((result (make-vector (vector-length sortvector) '*))
       (max (vector-length sortvector))
       (index 0))
    ((>= index max) result)
    (insert (vector-ref sortvector index) result max index)
    (set! index (+ index 1))
    (display result)  ;; this is just to watch what happens
    (newline)))       ;; ditto

;; this procedure takes the item to be inserted in the
;; sorted list, along with the sorted list so far, and
;; finds the place where the item should be inserted.
;; then this procedure calls insert2
;; this procedure works by side-effect on result,
;; so what's returned doesn't matter

(define (insert new-item result max inputitemindex)
  (do ((index 0))
    ((>= index max) 'nothing) ;; insert works by side-effect
    (cond [(or (not (number? (vector-ref result index)))
               (< new-item (vector-ref result index)))
           (insert2 new-item index result max inputitemindex)
           (set! index max)]
          [else
           (set! index (+ index 1))])))

;; this procedure iterates right-to-left (backward,
;; whatever) through the sorted list, moving every
;; item one slot to the right until it has moved
;; the number pointed at by index, which opens
;; up a space for the new item to be inserted.
;; this procedure works by side-effect on result,
;; so what's returned doesn't matter
;; Note: this procedure has been optimized some, so
;; as not to move things it doesn't need to...
;; ultimately the insertion sort algorithm is
;; still O(n^2) in the worst case.

(define (insert2 new-item index result max inputitemindex)
  (do ((leftpointer index)
       (rightpointer inputitemindex))  ;; if we use max instead of
                                       ;; inputitemindex from way up
                                       ;; above, we'll move asterisks
                                       ;; we don't have to, but it's
                                       ;; not gonna change our Big-O
    ((= rightpointer leftpointer) (vector-set! result leftpointer new-item))
    (vector-set! result rightpointer (vector-ref result (- rightpointer 1)))
    (set! rightpointer (- rightpointer 1))))

The verbosity of Scheme's do structure makes the code a little bit
ugly, but it should run faster than the previous version since we're
not building lists...we've eliminated the expense of the conses.  But
if we're counting comparisons, this version still has time complexity
of O(n^2).

And of course we could do the whole thing in Python, taking advantage
of Python's much cooler iterative forms as well as built-in 
methods for inserting into the middle of a list and appending
to the end:

def InsertionSort(inputlist):

  result = inputlist[0:1]          # seed the result list with the
                                   # first item from input list 
  print result
  for item in inputlist[1:]:       # begin with the second item from input list
    InsertItem(item, result)
    print result
  return result                    # this function returns the sorted list  
                                   # without munging the input list


def InsertItem(item, result):
  for i in range(0, len(result)):  # look at everything in result and
    if item < result[i]:           # find place to insert item

      result.insert(i, item)       # insert the item in that place

      return result                # what's returned is unimportant as
                                   # this procedure works with side effects
                                   # on result
    else:
      pass
  result.append(item)              # if the item isn't inserted somewhere
                                   # then stick it on the end of the list


Here are some test cases:

print "start test1"
test1 = [6, 2, 4, 1, 3, 5]
print test1
print InsertionSort(test1)
print test1
print " "
print "start test2"
test2 = [6, 5, 4, 3, 2, 1]
print test2
print InsertionSort(test2)
print test2
print " "
print "start test3"
test3 = [1, 2, 3, 4, 5, 6]
print test3
print InsertionSort(test3)
print test3
print " "
print "end"

Here's the output:

start test1
[6, 2, 4, 1, 3, 5]
[6]
[2, 6]
[2, 4, 6]
[1, 2, 4, 6]
[1, 2, 3, 4, 6]
[1, 2, 3, 4, 5, 6]
[1, 2, 3, 4, 5, 6]
[6, 2, 4, 1, 3, 5]
 
start test2
[6, 5, 4, 3, 2, 1]
[6]
[5, 6]
[4, 5, 6]
[3, 4, 5, 6]
[2, 3, 4, 5, 6]
[1, 2, 3, 4, 5, 6]
[1, 2, 3, 4, 5, 6]
[6, 5, 4, 3, 2, 1]
 
start test3
[1, 2, 3, 4, 5, 6]
[1]
[1, 2]
[1, 2, 3]
[1, 2, 3, 4]
[1, 2, 3, 4, 5]
[1, 2, 3, 4, 5, 6]
[1, 2, 3, 4, 5, 6]
[1, 2, 3, 4, 5, 6]
 
end

But now we're back to manipulating lists and pointers, and that's
expensive, and in terms of comparisons, it's still gonna have
time complexity of O(n^2).  Could we speed this up?  Sure, we
could translate the vector-based Scheme approach above into
a Python version where we treat lists as arrays, and avoid
list operators like insert and append.  But still, it's O(n^2).
That's the nature of insertion sort.  No matter how you optimize
things, as n, the number of things to be sorted, grows, the amount
of work to be done grows along the lines of n^2.  That's what this
Big-O stuff is all about.  It's not about comparing different 
implementations of the same algorithm, it's about describing the
behavior of a class of algorithms as the input grows.
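
Here's roughly what that array-style Python version might look like.
This is a sketch, not a line-by-line translation of the Scheme vector
code:  it sorts the list in place and scans the sorted part from the
right while shifting, which is a common textbook variant.  The name
insertion_sort_in_place and the comparison counter are assumptions added
for illustration.

```python
def insertion_sort_in_place(items):
    """Sort items in place using only indexing; return the comparison count."""
    comparisons = 0
    for i in range(1, len(items)):
        current = items[i]
        j = i - 1
        # Shift larger items one slot right until current's slot opens up.
        while j >= 0:
            comparisons += 1
            if items[j] <= current:
                break
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = current
    return comparisons


for data in ([6, 2, 4, 1, 3, 5], [6, 5, 4, 3, 2, 1], [1, 2, 3, 4, 5, 6]):
    work = list(data)                # copy so the test data stays intact
    count = insertion_sort_in_place(work)
    print("%s -> %s  (%d comparisons)" % (data, work, count))
```

Because this variant scans from the right, its worst case is the
reverse-sorted input (15 comparisons for six items, which is 6*5/2) and
its best case is the already-sorted input (5 comparisons).  No insert, no
append, and it's still O(n^2) in the worst case.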


V.  Bubblesort

Mitch talked about bubblesort in class.  It's an interesting
algorithm because it's not immediately obvious that this approach
could get stuff sorted, and the code can be really compact.
However, this approach too has a time complexity of O(n^2).
So algorithm complexity isn't about how much code there is,
it's still only about how much work gets done in terms of
the cost measure that you use.

Mitch's explanation looked like this:

  Traverse an (unsorted) collection of elements from front
  to back or left to right

  "Bubble" the largest value in the list to the end or right
  using pairwise comparisons and swapping.

  You keep repeating the two steps above until the collection
  is sorted...at most, if there are n things to be sorted,
  you'll have to traverse the collection n-1 times.

Here's an example from Mitch's slides.  Start with this collection:

  77   42   35   12  101    5
   ^

We start our first traversal at the left.  77 is greater than 42, so
we swap:

  42   77   35   12  101    5
        ^

77 is greater than 35 so we swap again

  42   35   77   12  101    5
             ^

77 is greater than 12 so we swap again

  42   35   12   77  101    5
                  ^

77 is less than 101 so we leave 77 where it is and
resume again with trying to bubble 101 up or to the right:

  42   35   12   77  101    5
                       ^
101 is greater than 5 so we swap those values

  42   35   12   77    5   101
                             ^

Now we're at the end of the list.  We've completed one traversal,
and we observe that the biggest value in the list has bubbled
to the top or right.  One item is sorted, and n-1 items remain
to be sorted, so we have to traverse the list again, but we only
need to traverse n-1 elements.  After the next pass the list looks
like this:

  35   12   42    5   77  101

And then the process continues like this:

  12   35    5   42   77  101

  12    5   35   42   77  101

   5   12   35   42   77  101

There were 6 items in the list, and 6-1 or 5 traversals.  The
analysis of bubblesort works out just like with insertion sort.
In a list of n items, on the first pass you'll do n-1 comparisons
and at most n-1 swaps.  On the next pass you'll do n-2 comparisons
and at most n-2 swaps.  And so on.  Add 'em all up and you get
(n - 1) + (n - 2) + (n - 3) + ... + 1 comparisons/worst-case-swaps.
Have you seen that before?  Sure, it's the same series that we
saw with insertion sort above.  Looks like O(n^2) again.
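
If you'd like to check that sum without doing the algebra, here's a
small sketch (the function name passes_total is just made up for this
example, and it assumes a current Python 3).  It adds up the
comparisons pass by pass and prints the result next to n(n-1)/2,
which is the closed form for that series.  Since n(n-1)/2 is n^2/2
minus n/2, the n^2 term dominates as n grows, and that's where the
O(n^2) comes from.

```python
def passes_total(n):
  # Adds up (n - 1) + (n - 2) + ... + 1, one term per traversal.
  total = 0
  for comparisons_this_pass in range(n - 1, 0, -1):
    total = total + comparisons_this_pass
  return total

for n in [6, 10, 100, 1000]:
  # The last two values on each line should always agree.
  print(n, passes_total(n), n * (n - 1) // 2)
```

For the six-item example above, that's 5 + 4 + 3 + 2 + 1 = 15.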

Here's a Scheme-based approach to bubblesort on a vector of
numbers...if you want to bubblesort a list of numbers, feel
free to provide your own code:

(define (bubblesort invector)
  (do ((i 0)
       (max (- (vector-length invector) 1)))
    ((>= i max) invector)
    (traverse invector)
    (print invector)
    (newline)
    (set! i (+ i 1))))

(define (traverse invector)
  (do ((j 0)
       (max (- (vector-length invector) 1)))
    ((>= j max) invector)
    (cond [(> (vector-ref invector j) (vector-ref invector (+ j 1)))
           (swap invector j (+ j 1))]
          [else 'nothing])
    (set! j (+ j 1))))

(define (swap invector i1 i2)
  (let ((temp (vector-ref invector i1)))
    (vector-set! invector i1 (vector-ref invector i2))
    (vector-set! invector i2 temp)))

Here's a little bit of output:

> (bubblesort (vector 6 5 4 3 2 1))
#6(5 4 3 2 1 6)
#6(4 3 2 1 5 6)
#6(3 2 1 4 5 6)
#6(2 1 3 4 5 6)
#6(1 2 3 4 5 6)
#6(1 2 3 4 5 6)  ;; this one is returned, not printed
> (bubblesort (vector 5 1 4 6 2 3))
#6(1 4 5 2 3 6)
#6(1 4 2 3 5 6)
#6(1 2 3 4 5 6)
#6(1 2 3 4 5 6)
#6(1 2 3 4 5 6)
#6(1 2 3 4 5 6)  ;; this one is returned, not printed
> 

Bubblesort is much more concisely expressed in Python.  Here's
Mitch's version with one teeny weeny modification so that it
doesn't do any more work than necessary:

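def bsort(array):
  # n-1 traversals are enough to sort n items
  for i in range(0,len(array)-1):
    # compare each adjacent pair, swapping any pair that's out of order
    for j in range(0,len(array)-1):
      if array[j] > array[j+1]:
        swap(array, j, j+1)

def swap(array, i1, i2):
  # exchange the values at positions i1 and i2
  temp = array[i1]
  array[i1] = array[i2]
  array[i2] = temp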
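
Notice that bsort as written still compares n-1 pairs on every
traversal, even though the analysis counted n-1 comparisons, then
n-2, and so on.  Here's a sketch of a variant that matches the
analysis more closely.  The name bsort_shrinking is hypothetical and
this is not Mitch's code: it skips the part of the list that's
already in place, and it quits early if a whole traversal makes no
swaps.  It's still O(n^2) in the worst case.

```python
def bsort_shrinking(array):
  # Hypothetical variant, not the version from the lecture.
  n = len(array)
  for i in range(0, n - 1):
    swapped = False
    # After i passes the last i slots already hold their final values,
    # so stop the inner loop before reaching them.
    for j in range(0, n - 1 - i):
      if array[j] > array[j + 1]:
        array[j], array[j + 1] = array[j + 1], array[j]
        swapped = True
    if not swapped:
      break                  # a full pass with no swaps: already sorted
  return array

print(bsort_shrinking([77, 42, 35, 12, 101, 5]))
```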


VI.  Using tree recursion as a better sort of sorting strategy

So now the question is "can we do better than O(n^2) with sorting?" 
The answer is yes, and the trick is that you have to think about the 
sorting problem differently. Here's another way to sort a list: To 
sort a list, you cut the list into two smaller lists of say equal 
length. Call them the lefthalf and the righthalf. Now you sort the 
lefthalf and the righthalf independently. The results should be that 
you get back two sorted lists. You then merge those two lists in such 
a way that the result is one sorted list containing all the elements 
of the two smaller sorted lists. In other words, to sort a list, you 
cut the list into halves, sort the halves, and then put the results 
back together into a sorted list. How do you sort the halves? Well, 
you just cut the halves in half, sort those halves (now quarters), 
and merge the results. And how do you merge the quarters? It should 
be obvious by now.

Here's a more graphical way of looking at the same problem-solving 
approach. Again, we start with the unordered list:


                     (7  3  1  4  8  6  2  5)


To sort this list, we split the list into two halves, and take a leap 
of faith that if we sort each of the two halves and then merge 
the results together, we'll get a sorted list:


                     (7  3  1  4  8  6  2  5)
                                 ^
                                 |
                               merge
                      ^                      ^
                      |                      |
                (7  3  1  4)           (8  6  2  5)


Then to sort those two lists, we split each of them in half, sort 
them, and merge the results:


                     (7  3  1  4  8  6  2  5)
                                 ^
                                 |
                               merge
                      ^                      ^
                      |                      |
                (7  3  1  4)           (8  6  2  5)

                      ^                      ^
                      |                      |
                    merge                  merge

               ^            ^          ^             ^
               |            |          |             |
            (7  3)       (1  4)      (8  6)        (2  5)


We keep splitting, sorting, and merging until we get down to lists 
with only one element, because those lists are already sorted for us!


                     (7  3  1  4  8  6  2  5)
                                 ^
                                 |
                               merge
                      ^                      ^
                      |                      |
                (7  3  1  4)           (8  6  2  5)

                      ^                      ^
                      |                      |
                    merge                  merge

               ^            ^          ^             ^
               |            |          |             |
            (7  3)       (1  4)      (8  6)        (2  5)
               ^            ^          ^             ^
               |            |          |             |
             merge        merge      merge         merge
             ^   ^        ^   ^      ^   ^         ^   ^
             |   |        |   |      |   |         |   |
            (7) (3)      (1) (4)    (8) (6)       (2) (5)


Note that, on the way down, we haven't really sorted anything yet! 
All we did was keep splitting our lists until we got down to 
one-element lists, and each of those is sorted by definition. The 
real "sorting" will happen in the merges that are performed from this 
point on. For example, when we merge the two lists (7) and (3), the 
merge process will produce the sorted list (3 7). The other merges 
will produce (1 4), (6 8), and (2 5):


                     (7  3  1  4  8  6  2  5)
                                 ^
                                 |
                               merge
                      ^                      ^
                      |                      |
                (7  3  1  4)           (8  6  2  5)

                      ^                      ^
                      |                      |
                    merge                  merge

               ^            ^          ^             ^
               |            |          |             |
            (3  7)       (1  4)      (6  8)        (2  5)
               ^            ^          ^             ^
               |            |          |             |
             merge        merge      merge         merge   <- start
             ^   ^        ^   ^      ^   ^         ^   ^      here
             |   |        |   |      |   |         |   |
            (7) (3)      (1) (4)    (8) (6)       (2) (5)


We now merge those two-element lists to get two sorted four-element lists:


                     (7  3  1  4  8  6  2  5)
                                 ^
                                 |
                               merge
                      ^                      ^
                      |                      |
                (1  3  4  7)           (2  5  6  8)

                      ^                      ^
                      |                      |
                    merge                  merge   <- now do these

               ^            ^          ^             ^
               |            |          |             |
            (3  7)       (1  4)      (6  8)        (2  5)
               ^            ^          ^             ^
               |            |          |             |
             merge        merge      merge         merge
             ^   ^        ^   ^      ^   ^         ^   ^
             |   |        |   |      |   |         |   |
            (7) (3)      (1) (4)    (8) (6)       (2) (5)


Finally, we merge the two four-element lists to get one 
eight-element, entirely sorted list:


                     (1  2  3  4  5  6  7  8)
                                 ^
                                 |
                               merge             <- now we merge
                      ^                      ^      this one
                      |                      |
                (1  3  4  7)           (2  5  6  8)

                      ^                      ^
                      |                      |
                    merge                  merge   <- now do these

               ^            ^          ^             ^
               |            |          |             |
            (3  7)       (1  4)      (6  8)        (2  5)
               ^            ^          ^             ^
               |            |          |             |
             merge        merge      merge         merge
             ^   ^        ^   ^      ^   ^         ^   ^
             |   |        |   |      |   |         |   |
            (7) (3)      (1) (4)    (8) (6)       (2) (5)


This is an example of a problem-solving strategy called "divide and 
conquer": make the problem into smaller problems, solve the smaller 
problems, and join the results. You've been doing this all along, you 
just didn't know it had a name. And the particular sorting algorithm 
that implements this divide and conquer strategy is called mergesort.


VII.  Mergesort in Scheme

Here's an English description of the mergesort procedure we illustrated above:

To sort a list, you cut the list into two smaller lists of say equal 
length. Call them the lefthalf and the righthalf. Now you sort the 
lefthalf and the righthalf independently. The results should be that 
you get back two sorted lists. You then merge those two lists in such 
a way that the result is one sorted list containing all the elements 
of the two smaller sorted lists.

When I look at this description and think about how to design a 
mergesort program, I see at least four different procedures that are 
named in the description: a sorting procedure, a procedure to return 
the lefthalf of a list, a procedure to return the righthalf of a 
list, and a procedure for merging two sorted lists.

And when I start to think about that sorting procedure, I want to 
call it mergesort, since that's the name of the sorting algorithm I'm 
trying to implement.  That top-level procedure does exactly what the 
English description says: to mergesort a list, merge the results of 
calling mergesort on the lefthalf of the list and the righthalf of 
the list:


(define (mergesort sortlist)
    (cond ((null? sortlist) '())            ;; in case someone tries to
                                            ;; sort the empty list

          ((null? (cdr sortlist)) sortlist) ;; making sure we don't split
                                            ;; a one-element list and
                                            ;; recurse forever

          (else (merge (mergesort (lefthalf sortlist))
                       (mergesort (righthalf sortlist))))))


Figuring out what the left half of a list is involves finding the 
length of the list, dividing by 2, and then peeling off that many 
elements of the list and returning those elements as a list:


(define (lefthalf sortlist)
    (lefthalf-helper sortlist (floor (/ (length sortlist) 2))))

(define (lefthalf-helper sortlist endcount)
    (cond ((= endcount 0) '())
          (else (cons (car sortlist) (lefthalf-helper (cdr sortlist)
                                                      (- endcount 1))))))


Figuring out what the right half of a list is involves finding the 
length of the list, dividing by 2, and then peeling off that many 
elements of the list and throwing them away, returning what's left 
over:


(define (righthalf sortlist)
    (righthalf-helper sortlist (floor (/ (length sortlist) 2))))

(define (righthalf-helper sortlist startcount)
    (cond ((= startcount 0) sortlist)
          (else (righthalf-helper (cdr sortlist) (- startcount 1)))))


By using the same number for counting off list elements, we minimize 
the potential for arithmetic errors in lefthalf and righthalf, which 
could in turn give us erroneous left halves and right halves.

Finally, merging two sorted lists is not unlike putting together the 
two halves of a zipper, except that you have to do some comparisons 
to maintain sorted order as you merge:


(define (merge list1 list2)
    (cond ((null? list1) list2)
          ((null? list2) list1)
          ((<= (car list1) (car list2))
           (cons (car list1) (merge (cdr list1) list2)))
          (else
           (cons (car list2) (merge list1 (cdr list2))))))
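
For comparison, here's a sketch of the same idea in Python.  The
names mergesort and merge mirror the Scheme procedures above, but
the details are my own assumptions for illustration: the two halves
come from list slicing around a single middle index instead of
separate lefthalf and righthalf helpers, and merge uses a loop
rather than recursion, since Python limits how deep recursion can go.

```python
def mergesort(sortlist):
  if len(sortlist) <= 1:        # empty and one-element lists are sorted
    return sortlist
  middle = len(sortlist) // 2   # same count used for both halves
  lefthalf = sortlist[:middle]
  righthalf = sortlist[middle:]
  return merge(mergesort(lefthalf), mergesort(righthalf))

def merge(list1, list2):
  # Both arguments must already be sorted.
  result = []
  i = 0
  j = 0
  while i < len(list1) and j < len(list2):
    if list1[i] <= list2[j]:
      result.append(list1[i])
      i = i + 1
    else:
      result.append(list2[j])
      j = j + 1
  # One list is used up; whatever is left in the other is already sorted.
  return result + list1[i:] + list2[j:]

print(mergesort([7, 3, 1, 4, 8, 6, 2, 5]))
```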


VIII.  Analysis of mergesort

Here's the mergesort program all together:


;; mergesort is the top-level function...it takes a list of numbers
;; to be sorted, splits the list into two equal-sized (plus or minus 1)
;; sublists, calls mergesort on the two sublists recursively, and then
;; calls merge to merge the resulting sorted sublists.  mergesort stops
;; recursing on a list with only one element, which by definition is
;; a sorted list.

(define (mergesort sortlist)
    (cond ((null? sortlist) '())            ;; in case someone tries to
                                            ;; sort the empty list

          ((null? (cdr sortlist)) sortlist) ;; making sure we don't split
                                            ;; a one-element list and
                                            ;; recurse forever

          (else (merge (mergesort (lefthalf sortlist))
                       (mergesort (righthalf sortlist))))))



;; lefthalf returns a list of the first n elements of a list, where
;; n is floor(length of the list/2)

(define (lefthalf sortlist)
    (lefthalf-helper sortlist (floor (/ (length sortlist) 2))))

(define (lefthalf-helper sortlist endcount)
    (cond ((= endcount 0) '())
          (else (cons (car sortlist) (lefthalf-helper (cdr sortlist)
                                                      (- endcount 1))))))


;; righthalf returns a list of all but the first n elements of a
;; list, where n is floor(length of the list/2)

(define (righthalf sortlist)
    (righthalf-helper sortlist (floor (/ (length sortlist) 2))))

(define (righthalf-helper sortlist startcount)
    (cond ((= startcount 0) sortlist)
          (else (righthalf-helper (cdr sortlist) (- startcount 1)))))

;; merge takes two sorted lists as arguments and merges the two lists
;; into a single list while retaining the sorted order...duplicate
;; elements are retained

(define (merge list1 list2)
    (cond ((null? list1) list2)
          ((null? list2) list1)
          ((<= (car list1) (car list2))
           (cons (car list1) (merge (cdr list1) list2)))
          (else
           (cons (car list2) (merge list1 (cdr list2))))))


We started down this road because we were looking for a sorting 
algorithm that gave us better time complexity than insertion sort's 
O(n^2), so you're probably expecting that mergesort gives us that 
better time complexity. Good guess. (And remember, "algorithm" is 
just another way of saying "a set of instructions which when executed 
solves some problem or does something useful". It's pretty much 
interchangeable with the word "procedure" in this course, although we 
sometimes use the word algorithm when we want to make note of the 
separation between the high-level approach to solving a problem 
(algorithm) and the actual implementation in some specific 
programming language (procedure). However, you should also know that 
there's another aspect of the definition for "algorithm" that we're 
ignoring here: by definition, an algorithm is guaranteed to halt. 
It's an important theoretical distinction, but you don't need to 
worry about it for the time being. Now back to our regularly 
scheduled lecture notes.) Let's analyze the complexity of the 
mergesort algorithm.

We start by noticing that when it split the original unsorted list 
into two lists, the algorithm had to deal with n elements---at the 
very least, to split the eight-element list into two four-element 
lists, the algorithm had to peel off the first four elements, or n/2 
elements. And just to make life simpler, let's not worry about 
low-level details like how many conses were performed...let's analyze 
the algorithm independently of how things are done in Scheme, so for 
our unit cost we'll just count how many times a list element is 
"handled" by the algorithm (you can convince yourself that the 
analysis would still hold up if we counted conses, although the 
constants will change some).

What about at the next level of splitting? Well, to split two 
four-element lists into four two-element lists, the algorithm had to 
peel off the first two elements of each of two lists, and that's four 
elements again, or n/2 elements. In fact, every time the algorithm 
splits all the lists at a given level of splitting, it has to handle 
n/2 elements. So the work done at each level of splitting is O(n/2), 
and since the constant isn't all that important to us, we say the 
time complexity at each level is O(n).

Once the splitting is all done, what's the time complexity of 
merging? Well, when you merge eight lists into four lists, at worst 
you handle every list element. And when you merge four lists into 
two, again you handle every list element, in the worst case. So it's 
easy to see that the time complexity at every level of merging is 
O(n).

So at every level of doing either splitting or merging, the 
complexity is O(n). More accurately, it's O(2n), but of course the 
constant once again is thrown away. Now we have to ask how many 
levels of splitting or merging there are. If the number of levels 
remains constant regardless of the n, then we'll be able to throw 
that away and declare that we have a sorting algorithm that has O(n) 
worst-case time complexity, and that would be worthy of a doctoral 
dissertation or two. On the other hand, if the number of levels is 
somehow proportional to n, then we may have to include that in our 
analysis. Oh, in case you were waiting to have your Ph.D. bestowed on 
you now, rest assured that the number of levels does not remain 
constant.

How the number of levels is affected by n may not be immediately 
obvious to you...it's the kind of thing that mathematicians get paid 
to notice. And what they'd notice is this: when n = 8, we see 3 
levels of splitting and 3 levels of merging. What if n = 4? You can 
see from the figures above that it takes 2 levels of splitting to get 
from one four-element list to four one-element lists, and not 
surprisingly there are 2 levels of merging to get back to a sorted 
four-element list. The same figures show you that if n = 2, there 
will be 1 level of splitting and 1 level of merging. In other words, 
every time we divide n by 2, we reduce the number of levels by one. 
What if we doubled n? It won't take you much work to convince 
yourself that if n = 16, there will be 4 levels of splitting and 4 
more of merging. In other words, it looks like

    2^1 list elements results in 1 level of splitting and 1 level of merging
    2^2 list elements results in 2 levels of splitting and 2 levels of merging
    2^3 list elements results in 3 levels of splitting and 3 levels of merging
    2^4 list elements results in 4 levels of splitting and 4 levels of merging
    and so on

Or, in fewer words, if n = 2^k, then there are k levels of splitting 
and k levels of merging. If you know anything about how logarithms 
work, then you know that the base 2 logarithm of 2^k is k. And if you 
don't know about base 2 logarithms, that's just a way of saying that 
k is the exponent that we have to raise 2 to in order to produce 2^k. 
So the base 2 logarithm of 16 is 4, and the base 2 logarithm of 8 is 
3, and the base 2 logarithm of 4 is 2, and so on. We can shorten that 
text to read like this:

log_2 16 = 4

log_2 8 = 3

log_2 4 = 2

In general then, when there are n elements in the unsorted list that 
mergesort begins with, there will be

log_2 n levels of splitting and

log_2 n levels of merging.
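
If you'd rather see that than take it on faith, here's a quick
sketch, assuming a current Python 3 (math.log2 isn't in older
versions) and sticking to values of n that are powers of 2 like the
examples above.  The name count_levels is made up for this example.
It counts how many times n can be cut in half before getting down to
1, and prints that next to the base 2 logarithm of n.

```python
import math

def count_levels(n):
  # Counts how many times n can be cut in half before reaching 1.
  levels = 0
  while n > 1:
    n = n // 2
    levels = levels + 1
  return levels

for n in [2, 4, 8, 16, 1024]:
  # For powers of 2 the last two values on each line agree.
  print(n, count_levels(n), math.log2(n))
```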

When we're dealing with issues of computational complexity and Big-O 
notation, we ignore bases in the same way that we ignore other 
constants...we're really only interested in the variables when we're 
analyzing how the complexity changes as the size of the problem 
grows. So we can say that there are log n levels of splitting and log 
n levels of merging for an n-element list, or O(2 log n) levels 
altogether. Again, we eliminate the constant and say that there are 
on the order of log n levels of splitting and merging. Since there 
are on the order of n elements handled at each level, and there are 
on the order of log n levels of handling, we can then say that the 
worst-case time complexity for mergesort is O(n) * O(log n), which we 
can combine as O(n log n) time complexity for mergesort. And O(n log 
n) is the best worst-case time complexity you're going to find for 
any sorting algorithm that works by comparing elements, so mergesort 
turns out to be pretty darn good.

Somebody once asked why we don't just throw away the log n part 
and call it O(n) complexity. We could throw away the log n part if 
the complexity were analyzed to be O(n + log n), because log n 
doesn't grow very much as n gets big, so adding log n to n doesn't 
change the complexity much as n grows. We keep the dominant term n 
and toss everything else. But when we're multiplying n times log n, 
that's different. Even though log n grows slowly, it does grow, and 
as n increases to very big numbers, O(n log n) diverges greatly from 
O(n), so ignoring the log n in the product n log n would give us 
estimates of complexity that would be way off base for large values 
of n.
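
Here's a small sketch that puts numbers on that, assuming a current
Python 3.  The function name growth_row is made up for this example.
It just computes n, n log n (base 2), and n^2 for the same n so you
can watch the three pull apart.

```python
import math

def growth_row(n):
  # Returns n, n * log2(n), and n squared for side-by-side comparison.
  return (n, n * math.log2(n), n * n)

for n in [10, 1000, 1000000]:
  print(growth_row(n))
```

At n = 1,000,000 the base 2 logarithm of n is a bit under 20, so
n log n is roughly 20 million.  That's about 20 times bigger than n,
which is why we can't ignore the log n, but it's still tiny next to
n^2, which is a trillion.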

The analysis of algorithm performance is a big part of what good 
computer scientists do, or should be doing, when creating software. 
You'll see more of this sort of stuff as you continue with your 
career in computer science, assuming that's your career choice. Until 
then, you should know at least that there are different sorts of 
classic time-complexity behavior that you can expect. We can 
summarize them in the following table, adapted from the book 
"Algorithmics" by David Harel:

                 O(K) or O(1)     constant time               good

                 O(log N)         logarithmic time

 polynomial      O(N)             linear time
     time
                 O(N log N)       N log N time

                 O(N^2)           N-squared or quadratic time

                 O(N^3)           cubic time

                 O(N^K)           etc.

-------------

                 O(2^N) or O(K^N) exponential time

 exponential     O(N!)            factorial time
     time
                 O(N^N)           forget it                   bad


In general, anything that falls in the realm of polynomial time is 
considered a reasonable algorithm. Even if the numbers get really big 
as N increases, those numbers are relatively small in the 
mathematical arena. On the other hand, any algorithm that exhibits 
exponential time complexity (or space complexity for that matter) is 
considered to be unreasonable or intractable.
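
To get a feel for the gap between the two halves of that table,
here's a tiny sketch that compares a cubic cost with an exponential
one (the name cubic_vs_exponential is made up for this example, and
it assumes a current Python 3).

```python
def cubic_vs_exponential(n):
  # Returns N^3 and 2^N for the same N.
  return (n ** 3, 2 ** n)

for n in [10, 20, 50, 100]:
  print(n, cubic_vs_exponential(n))
```

At N = 10 the two are about the same (1000 versus 1024), but at
N = 100 the cubic is a million while 2^N has 31 digits.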

Still, you have to keep in mind that we've been working with 
worst-case scenarios all along, and you really need to consider 
average-case and best-case performance as well. While you may come up 
with an algorithm that has unreasonable worst-case complexity, it may 
have really good average-case complexity. And if the worst-case 
scenario doesn't occur very often, or if you can recognize when it 
does and do something else instead, then your algorithm may be pretty 
good after all. Being able to do this kind of analysis well requires 
lots of practice, of course, and a little bit of art thrown in with 
your science. It's fun stuff once you get the hang of it.
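
As one small illustration of best case versus worst case, here's a
sketch that counts comparisons for an array-style insertion sort.
The name count_comparisons and this particular way of writing
insertion sort are assumptions for the example, not code from the
lecture.  The same algorithm does very different amounts of work
depending on the order of its input.

```python
def count_comparisons(array):
  # Runs insertion sort on a copy of array and returns how many times
  # two elements were compared.
  work = list(array)
  comparisons = 0
  for i in range(1, len(work)):
    item = work[i]
    j = i - 1
    while j >= 0:
      comparisons = comparisons + 1
      if work[j] <= item:
        break                # item is already in the right place
      work[j + 1] = work[j]  # shift the larger value right
      j = j - 1
    work[j + 1] = item
  return comparisons

print(count_comparisons([1, 2, 3, 4, 5, 6]))   # already sorted
print(count_comparisons([6, 5, 4, 3, 2, 1]))   # reversed
```

On the already-sorted list this version makes 5 comparisons, one per
item after the first.  On the reversed list it makes 15, which is
the full 5 + 4 + 3 + 2 + 1.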



Copyright (c) 2003 by Kurt Eiselt.  All rights reserved, with 
the exception of stuff that belongs to somebody else.

Last revised: December 11, 2003