CS 2360 - January 30, 1997

Lecture 8 -- Data Abstraction


Data abstraction

By this time we should all be pretty familiar with the basic 
ideas behind abstraction.  And the type of abstraction that we've
been dealing with mostly is called procedural abstraction.  That
is, in building our procedures, we've been:

  Postponing worrying about the details
  Decomposing problems into smaller pieces
  Putting as much distance as possible between the high-level
    conceptual ideas of what we're trying to do and the details
    of how it gets done

That last tidbit is pretty important.  What it says is that, in the
procedures we build, we're trying to separate the theory, the design,
the algorithm, etc., from the low-level implementation stuff as
much as we can.

Why is this good for us?  It aids in designability, maintainability,
adaptability, readability, debuggability, and all the other itys.  
But we also know that it's painful.  Why?  Let's face it...we're 
all used to slam-dunking code, and all this abstraction stuff is 
hard to think about if we're not used to it but we're being forced 
to use it.  Ouch!

Nevertheless, the positives outweigh the negatives here, so we 
continue abstracting away.  We get similar benefits when we 
abstract away the details of our data structures.  For example,
already we know that in LISP we can build linked-list data 
structures easily without worrying about the details of how those
structures are implemented.  And that makes our programming lives
easier.  Those same simple tasks would be a lot more difficult
in Pascal or C, no?  LISP has done some of the abstraction for us.
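
To see how much LISP is hiding from us, here's a minimal sketch (the
variable names are made up for illustration).  It builds the same
three-element list twice: once cell by cell with cons, and once with
list, which does the cell-building for us.

```lisp
;; Build (a b c) one cons cell at a time...
(defparameter *by-hand* (cons 'a (cons 'b (cons 'c nil))))

;; ...or let LISP worry about the cells.
(defparameter *the-easy-way* (list 'a 'b 'c))

;; Both are the same list as far as EQUAL is concerned.
(equal *by-hand* *the-easy-way*)   ; => T
```

Either way, we never had to hook up a single pointer ourselves.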

So this puts us into another, but related, world of abstraction 
that's called data abstraction.  We want to be able to write our
programs in ways that focus on the high-level concepts about the
data while abstracting away the implementation details of the
data structures.

What else do we gain by employing data abstraction?  We get to
work with pretty pictures, and that actually makes our lives 
easier too.


Graphs -- your basic abstraction

There's an abstraction that computer science types use all the time.
It's called a graph, and you've no doubt seen graphs before.
A graph is just a collection of vertices and edges (or nodes and 
links, or nodes and arcs, or...).

Graphs without cycles have nice mathematical properties, but that's
material for a class on graph theory.  If we want to represent 
information being contained in the vertices, we draw circles at
the vertices and write the information in there.  That's another
abstraction, because that's not how it really looks in memory, but
it makes it easier for us to think about and play with:

And if we put orientations on the edges, we get something called
a directed graph:

This directed graph notation gives us an abstract representation
of a linked list in LISP, like '(a b c).  The list, as you recall,
is a previously-defined abstract data type in LISP.
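
In the directed-graph view of a list, the car of each cell is what's
written inside a node and the cdr is the arrow to the next node.  As
a rough sketch (count-nodes and last-label are names invented here,
not predefined LISP functions), following the arrows might look like
this:

```lisp
;; Count the nodes by following cdr "arrows" until we run off the end.
(defun count-nodes (lst)
  (if (null lst)
      0
      (+ 1 (count-nodes (cdr lst)))))

;; Return what's written in the last node (NIL for an empty list).
(defun last-label (lst)
  (cond ((null lst) nil)
        ((null (cdr lst)) (car lst))      ; no more arrows to follow
        (t (last-label (cdr lst)))))

(count-nodes '(a b c))   ; => 3
(last-label '(a b c))    ; => C
```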


Association lists

There's another data abstraction that's used frequently in LISP.  
It's called the association list, or a-list for short.  It's a 
list of two- (or more) element sublists, typically used as a sort
of lookup table.  Here's a LISP-level abstraction of an a-list:

((eiselt 1.9) (shackelford 2.2) (greenlee 2.0))

And here's a lower-level, box-and-pointer version of the same a-list:


     _______                 _______                 _______ 
    |   |   |               |   |   |               |   |  /|
    | | | --+-------------->| | | --+-------------->| | | / |
    |_|_|___|               |_|_|___|               |_|_|/__|
      |                       |                       |
      |                       |                       |
     \|/                     \|/                     \|/
     _______     _______     _______     _______     _______     _______
    |   |   |   |   |  /|   |   |   |   |   |  /|   |   |   |   |   |  /|
    | | | --+-->| | | / |   | | | --+-->| | | / |   | | | --+-->| | | / |
    |_|_|___|   |_|_|/__|   |_|_|___|   |_|_|/__|   |_|_|___|   |_|_|/__|
      |           |           |           |           |           |
   eiselt        1.9     shackelford     2.2       greenlee      2.0


What helps to make the a-list so prevalent in LISP is the existence
of a predefined a-list operation called assoc.  (It's one of the
functions that you defined for yourself in the second homework
assignment.)  The assoc function takes two arguments, a key and an 
a-list, and searches down the a-list for the sublist whose first 
element matches the key:

? (assoc 'shackelford '((eiselt 1.9)(shackelford 2.2)(greenlee 2.0)))
(SHACKELFORD 2.2)
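
For reference, here's one way a home-grown version could look.  This
is only a simplified sketch, not the official homework solution and
not how the built-in assoc is actually implemented; my-assoc,
lookup-value, and *table* are names made up for this example.

```lisp
(defparameter *table*
  '((eiselt 1.9) (shackelford 2.2) (greenlee 2.0)))

;; Walk down the a-list; return the first sublist whose first
;; element matches the key, or NIL if there isn't one.
(defun my-assoc (key a-list)
  (cond ((null a-list) nil)
        ((eql key (car (car a-list))) (car a-list))
        (t (my-assoc key (cdr a-list)))))

;; Hypothetical helper: hide the fact that the value we want
;; happens to be stored as the second element of the sublist.
(defun lookup-value (key a-list)
  (second (my-assoc key a-list)))

(my-assoc 'shackelford *table*)    ; => (SHACKELFORD 2.2)
(lookup-value 'greenlee *table*)   ; => 2.0
(my-assoc 'nobody *table*)         ; => NIL
```

Code that calls lookup-value doesn't need to know how the table is
laid out, which is the whole point of the abstraction.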


Trees

Suppose we wanted to encode the following special kind of graph
called a tree: 

We might use the following LISP representation if we wanted to encode
only the bottommost (leaf) nodes of the tree:

(((a b)(c d))((e f)(g h)))
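
With this representation, every atom in the nested list is a leaf.
As a small sketch (collect-leaves is a made-up name, not a predefined
function), we could gather the leaves from left to right like so:

```lisp
;; Return a flat list of the leaves, left to right.
(defun collect-leaves (tree)
  (cond ((null tree) nil)                 ; nothing left to look at
        ((atom tree) (list tree))         ; an atom is a leaf
        (t (append (collect-leaves (car tree))
                   (collect-leaves (cdr tree))))))

(collect-leaves '(((a b)(c d))((e f)(g h))))   ; => (A B C D E F G H)
```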

If we had a tree and we wanted to encode all the nodes including the
interior (non-leaf) nodes, we might use this LISP representation:

(a (b (d e))(c (f g)))

to encode this tree: 

Another equally good representation for this tree might be:

(a (b (d (nil nil) e (nil nil))) (c (f (nil nil) g (nil nil))))

You could write the appropriate functions, maybe called something like
"get-left-subtree" and "get-right-subtree", which could be used by
other functions to carry out various manipulations or traversals of 
this tree.  These functions to get the left and right children could 
abstract away the details of the implementation (which might be one 
or the other of the above) and allow you to write your tree-manipulation 
code without concern for the implementation. 
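
Here's a rough sketch of the idea.  To keep the selectors simple it
ASSUMES a third layout, (label left-subtree right-subtree) with NIL
for a missing subtree, which is not either of the representations
shown above.  The names get-label, preorder, and *tree* are also
invented for this example.

```lisp
;; ASSUMED layout: (label left-subtree right-subtree), NIL = no subtree.
(defparameter *tree*
  '(a (b (d nil nil) (e nil nil))
      (c (f nil nil) (g nil nil))))

;; The selectors are the only functions that know the layout.
(defun get-label (tree) (first tree))
(defun get-left-subtree (tree) (second tree))
(defun get-right-subtree (tree) (third tree))

;; A preorder traversal written entirely in terms of the selectors.
(defun preorder (tree)
  (if (null tree)
      nil
      (cons (get-label tree)
            (append (preorder (get-left-subtree tree))
                    (preorder (get-right-subtree tree))))))

(preorder *tree*)   ; => (A B D E C F G)
```

If we later switched to a different layout, only the three selectors
would need to be rewritten; preorder wouldn't change at all.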

Stay tuned...more next time.



Copyright 1997 by Kurt Eiselt.  All rights reserved.  The figures and
some parts of text are stolen directly from Ian Smith's notes for an
earlier offering of this same course.  Some of those notes were in
turn taken from my notes for an even earlier offering.  Evolution
at work, no?

Last revised: January 30, 1997