Intro to the course
~~~~~~~~~~~~~~~~~~~

How this course fits in the grand scheme of things

There's an old saying that goes like this:  "When all you have
is a hammer, everything looks like a nail."  If we apply the
saying to computing, it comes out like this:  "When all you know
is *programming language X*, everything looks like a problem
that should be solved using *programming language X*."  It's
common thinking throughout the computing world, and it's especially
prevalent amongst relatively new computing professionals who
don't have a lot of background yet.  One of the goals of our CS
curriculum is to prevent our students from falling into that
trap, so we try to expose them to several different programming
and problem-solving paradigms in the first couple of years.  CS
2360 is one of the courses that introduces you to what for most
of you will be a new way of thinking about computing and
software development.


The four big topics

You'll have the opportunity to learn a lot of stuff in this
course.  That stuff can be grouped into four categories, and
we'll give each category a name, just to help us keep it all
organized in our minds.


Software design

The first category of stuff falls under the heading of good
software design principles.  Those of you who have been playing
with computers for a number of years may have some idea that
things like program size (smaller = better) and program speed
(faster = better) are the important attributes to look for when
evaluating program quality.  That was certainly true many years
ago when computers were very expensive.  Today, however,
computers are cheap and getting cheaper, while at the same time
programmers are getting more and more expensive.  So the big
question in software development is no longer just how to best
conserve computer resources, but also how to best conserve human
resources.  Or, in other words, programmer efficiency is at
least as important as program efficiency, if not more so.

But what exactly is programmer efficiency?  Is it just getting
the code written as quickly as possible?  That's certainly part
of it, but the biggest expense in software development is
incurred in debugging, maintaining, and revising code.  Thus,
while we engage in the art or science of software development,
we should be striving not only for ease of design and
implementation, but also for qualities like readability,
debuggability (is that a word?), maintainability, revisability
(how about that one?), and reusability.  We can do this by
working toward controlling the complexity of our programs, and
we'll talk a lot about controlling software complexity in this
course.

So if you think your main concern as a programmer is "how do I
save some bytes?" or "how do I shave some cycles off my
execution time?", you need to put those ideas away until you
come across some case where those issues are important.  Don't
write code for the benefit of the computer; write for the
benefit of the people who build it, test it, evaluate it, debug
it, adapt it, and learn from it, and heed this quote from a
couple of really smart guys:

"...a computer language is not just a way of getting a computer
to perform operations ... it is a novel formal medium for
expressing ideas about methodology.  Thus, programs must be
written for people to read, and only incidentally for machines
to execute." (Abelson and Sussman)

The message you should take away from all this is clear:  more
and more, folks in computing are realizing that programs are
written for the benefit of people, not for the benefit of
computers.  If you don't clue into this soon, you're gonna be
one of the first ones against the wall when the revolution
comes.  (That should be familiar to Hitchhiker's Guide to the
Galaxy fans.)


Functional programming

The second category of stuff that we'll learn about is a programming
paradigm called "functional programming."  This paradigm is a contrast
to the "procedural"  (AKA "block-structured" or "imperative")
programming that you've been exposed to in CS 1501, or any other
introductory course in Pascal, FORTRAN, C, or similar programming
languages.  This paradigm also stands in contrast (although perhaps
less so) to the "object-oriented" paradigms that you are exposed to in
CS 1502 or 2390.

In the functional programming paradigm, we look at problems as things
to be decomposed into successively smaller sub-problems, just like we
would in any programming paradigm.  And of course we implement the
solutions to these small sub-problems with sets of instructions
organized into procedures, as we would in any programming paradigm.
But in the functional programming paradigm, there are a few simple but
significant restrictions that we place on our procedures, and typically
when we adhere to these restrictions we call our procedures
"functions".  (This nomenclature isn't entirely consistent in computer
land; you'll read more about this in some subsequent set of notes.)
The restrictions are these:

1.  Functions are procedures which access only the parameters passed to
    them through the argument list and are not influenced by any other
    external factors, so they behave just like the mathematical
    functions which you all are familiar with.

2.  Functions always return the same result given the same input
    parameters.

3.  Functions don't use side effects.  (A side effect is anything that a
    procedure does that persists after the procedure is no longer being
    executed.)  That means that functions don't employ assignment, among
    other things.

4.  Since there aren't any side effects and the behavior of a function
    can only be influenced through the passing of values through the
    argument list, the notion of a free or global variable simply
    disappears.

This causes a great deal of concern for students new to the paradigm;
you'll get used to it.
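
To make those restrictions a little more concrete, here is a small
sketch in Common LISP.  All of the names in it (*bonus*, score-impure,
score) are made up for illustration and don't come from anywhere else
in the course.  The first procedure breaks the rules; the second one
follows them.

```lisp
;; Hypothetical example; the names are invented for illustration.

(defparameter *bonus* 10)   ; a global variable

;; NOT a function in the restricted sense:  the result depends on the
;; global *bonus*, which isn't in the argument list, and the assignment
;; is a side effect that persists after the call is over.
(defun score-impure (points)
  (setf *bonus* (+ *bonus* 1))
  (+ points *bonus*))

;; A function in the restricted sense:  only the arguments matter,
;; there's no assignment, and the same inputs always give the same
;; result.
(defun score (points bonus)
  (+ points bonus))

(score 5 10)   ; => 15, every single time
```

Starting from a fresh *bonus* of 10, two calls to (score-impure 5) in a
row return 16 and then 17:  same argument, different answers.  That's
exactly the sort of mystery behavior the restrictions rule out.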

What do these restrictions buy us?  They give us a way of
controlling, but not eliminating, complexity.

Complexity, as you know, is "the enemy", as in _the_ enemy of software 
engineers, so any tools we can use as weapons against it are good.

The restriction that functions communicate only through argument
lists and the prohibition on global variables make it easy to
see what factors influence the behavior of a function; there are
no mystery dependencies that make code hard to read and debug.
The principle that the external influences on a procedure are
tightly constrained and very explicit, so that it's easy to see
how the procedure behaves, goes hand in hand with what's called
"referential transparency":  a call to a function can be replaced
by the value it returns without changing what the program does.
It's an important principle, and every software developer would
do well to employ it more than she or he already does.

The prohibition on assignment within a function means that you
don't have variables that are changing value in the middle of
the execution of a function.  That helps a lot when you're
trying to debug that function.  Think about it...would it be
easier to debug some procedure with a bunch of variables that
change values during execution, or would it be easier to debug
some procedure where that sort of behavior just isn't allowed?
It's hard to argue for the former, isn't it?
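
Here's a rough side-by-side sketch of that comparison in Common LISP.
Both procedures add up a list of numbers; the names
(sum-list-with-assignment, sum-list) are invented for this example.

```lisp
;; Hypothetical example; the names are invented for illustration.

;; With assignment:  total and remaining change value on every pass
;; through the loop, so to debug it you have to track what they hold
;; at each step.
(defun sum-list-with-assignment (numbers)
  (let ((total 0)
        (remaining numbers))
    (loop while remaining
          do (setf total (+ total (first remaining)))
             (setf remaining (rest remaining)))
    total))

;; Without assignment:  within any one call, numbers is bound once and
;; never changes.
(defun sum-list (numbers)
  (if (null numbers)
      0
      (+ (first numbers) (sum-list (rest numbers)))))

(sum-list '(1 2 3 4))   ; => 10
```

Both versions return the same answer.  The difference is in how much
you have to keep in your head while reading them.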

The results of these small functions are then combined or
synthesized by other functions, which return those results up to
other functions, and so on, until the desired result is found.
There are some additional advantages to writing computer programs
in this way:

1.  With a little practice, they're pretty easy to construct.

2.  They're especially easy to read.

3.  They're especially easy to debug.

4.  If you were really energetic, you could prove some properties, such 
    as correctness, about them.  In other words, such programs are 
    mathematically "clean".  (We won't have time to explore this 
    much, if at all, but when we move to semesters this topic will
    get much more exposure.)

The first three of those you'll have to take on faith for the moment,
but you'll discover them for yourself in a few weeks or so.
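
As a small taste of what "combining the results of small functions"
looks like, here is a sketch in Common LISP.  The three function names
are hypothetical, chosen just for this example; each function does one
small job and hands its result up to the next.

```lisp
;; Hypothetical example; the names are invented for illustration.

;; The smallest piece:  square one number.
(defun square (x)
  (* x x))

;; Combines the results of square over a whole list.
(defun sum-of-squares (numbers)
  (if (null numbers)
      0
      (+ (square (first numbers))
         (sum-of-squares (rest numbers)))))

;; Combines the result of sum-of-squares with the length of the list.
;; Assumes the list is not empty.
(defun mean-square (numbers)
  (/ (sum-of-squares numbers) (length numbers)))

(mean-square '(2 4))   ; => 10
```

Each piece can be read, tested, and debugged entirely on its own,
since none of them depends on anything but its arguments.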

Functional programming isn't a cure-all.  Many problems don't
lend themselves to an easy solution by adhering to the
functional programming paradigm, so in the last couple of weeks
of the course we'll relax some of these functional programming
prohibitions.


Knowledge representation and processing

The third category of stuff is named after the course
itself--"knowledge representation and processing."  Here we'll be
working with ways of organizing and using information which are
different from the more traditional and linear approaches such as
records, files, tables, and the like.  The organizational techniques
come under the heading of "hierarchical data structures," and they
include things like lists, trees, and relational networks.  The methods
for processing data stored in such structures fall under the heading of
"search techniques," and we'll see some dumb, brute-force search
methods, as well as some smarter search methods.  The methods for
knowledge representation and processing that we'll cover in the course
have a long history of use in the field of artificial intelligence (AI)
and in databases (DB), but over time these same techniques have crept
into other subareas of computer science, so even if you have no
interest in AI or DB, this stuff will all still be useful to you in the
long run.

Oh, one other thing about this third category of stuff -- all
these techniques for organizing and processing information are
easily tackled using recursion, so you'll see what you might
think is an overly zealous emphasis on recursive techniques
during the first few weeks.  We're just trying to get you to use
the best tools for a given job.
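
To give a rough idea of how recursion and hierarchical data fit
together, here is a sketch in Common LISP.  The tree, its labels, and
the function names are all made up for this example.  The tree is
written as a nested list, and the search is of the dumb, brute-force
variety:  look at everything until the label turns up.

```lisp
;; Hypothetical example; the data and names are invented for
;; illustration.

;; A tree written as a nested list:  (label subtree subtree ...)
(defparameter *sample-tree*
  '(animal (mammal (dog) (cat))
           (bird (robin))))

;; Brute-force depth-first search:  does label appear anywhere in tree?
(defun tree-member-p (label tree)
  (cond ((null tree) nil)                  ; nothing left to search
        ((atom tree) (eql label tree))     ; a single label:  compare it
        (t (or (tree-member-p label (first tree))
               (tree-member-p label (rest tree))))))

;; The same recursive shape, counting the labels instead.
(defun count-labels (tree)
  (cond ((null tree) 0)
        ((atom tree) 1)
        (t (+ (count-labels (first tree))
              (count-labels (rest tree))))))

(tree-member-p 'cat *sample-tree*)    ; => T
(tree-member-p 'fish *sample-tree*)   ; => NIL
(count-labels *sample-tree*)          ; => 6
```

Notice that the shape of each function mirrors the shape of the data:
a tree is either empty, a single label, or a first part and a rest,
and the function has one case for each.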


LISP

The fourth and final category of stuff you'll learn about in
this course is the Common LISP programming language.  Well,
everything was going along just fine until we mentioned LISP,
eh?  You may have heard ugly things about LISP:  there are too
many parentheses, it's only for artificial intelligence, it's
hard to learn, and so on.  If you heard these myths at Tech, you
probably heard them from people who had the language jammed down
their throats in one or two weeks after having studied Pascal
for two years.  But these are in fact just myths, as we hope
you'll see by the end of the quarter.  LISP will be the language
used in this course, and there are lots of good reasons for
doing so:

1.  LISP started out as a purely functional programming language, and 
still retains much of that flavor.  At the very least, it still encourages a 
functional programming style, and that's going to help us in this course.

2.  LISP is especially good for processing complex hierarchical data 
structures using recursion.  In fact, it is so surprisingly good that those 
of you who have learned something about processing these sorts of data 
structures using pointers will find it hard to believe that it could be so 
easy (LISP does all the pointer management for you...really).

3.  LISP is the second oldest general-purpose programming language in use 
today (FORTRAN is the oldest).  As such, LISP has the benefit of lots of 
people who have devoted lots of years to making it better.  This shows up in 
the form of some of the most sophisticated programming tools (editors, 
debuggers, etc.) in existence.

4.  Most programming languages are designed so that compilation is 
efficient, often at the expense of things like consistency and usability.  
LISP on the other hand was designed (and continues to evolve) to maintain 
mathematical consistency and usability, sometimes at the expense of making 
the computer work a little harder.

5.  Because of its age, LISP is incredibly well documented.

6.  LISP has a uniform simple syntax, and there is no distinction between 
program and data.  (The latter point is not important now, but it will 
become more important as the course progresses.)

7.  In its "normal" state, LISP acts like an interpreted language (although
the LISP you use is more than likely built on an incremental compiler).
That is, code is executed immediately, and no separate batch compile stage 
is necessary (although batch compiling of LISP code is both possible 
and desirable).  Interpreted or incrementally-compiled languages are 
highly interactive and give immediate feedback, making LISP an 
increasingly popular tool for the rapid prototyping of complex software 
systems such as new language compilers, graphics systems, and user interfaces.

8.  LISP is highly extensible.  In other words, it's very easy to build 
new languages on top of LISP by adding functions.  This feature is in large 
part responsible for LISP's longevity.  As new programming paradigms emerged 
over time, many programming languages fell by the wayside, while LISP was 
extended to accommodate the new paradigms.

9.  The Common LISP standard makes LISP programs very portable.  
Portability lets you migrate an application from one platform to another 
without having to rewrite bunches of code.  This is an important thing for 
you budding software entrepreneurs to remember.

10.  LISP has automatic storage management.  Memory is allocated as it's 
needed on the fly, and memory that is no longer needed is "garbage collected" 
dynamically as well.  There's seldom a need to "declare" data structures in 
advance.

11.  LISP also has dynamic typing, which means you no longer have to make 
data type declarations.  LISP does type-checking and necessary conversions at 
run-time.  This feature and the previous one put LISP in a category of 
programming languages now called "dynamic languages."  Newer dynamic 
languages, such as Dylan, are generating a lot of interest in the world of 
computing; LISP is sort of the granddaddy of dynamic languages.

12.  Finally, LISP is the favored language for artificial intelligence 
work, at least in the United States, so if you're interested in AI, this is 
the language to know.  And even if you're not interested, AI is a required 
course for CS majors at Tech, and it's usually taught using LISP, so LISP 
is still the language to know.
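
A couple of the points in that list (no distinction between program and
data, and dynamic typing) are easier to believe once you've seen them.
Here's a short sketch in Common LISP; the names *expression* and
describe-value are invented for this example.

```lisp
;; Hypothetical example; the names are invented for illustration.

;; A piece of program held as ordinary list data (note the quote).
(defparameter *expression* '(+ 1 2 3))

(first *expression*)    ; => +   (just a symbol in a list)
(length *expression*)   ; => 4
(eval *expression*)     ; => 6   (the same list, run as a program)

;; Build a new program with ordinary list operations, then run it.
(eval (cons '* (rest *expression*)))   ; (* 1 2 3) => 6

;; Dynamic typing:  no type declarations anywhere, and the same
;; function accepts values of different types, checked at run time.
(defun describe-value (x)
  (cond ((integerp x) 'integer)
        ((stringp x) 'string)
        ((listp x) 'list)
        (t 'something-else)))

(describe-value 42)        ; => INTEGER
(describe-value "hello")   ; => STRING
(describe-value '(a b))    ; => LIST
```

Nothing in that sketch allocates or frees memory explicitly, either;
the lists are created as needed and cleaned up automatically.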


Conclusion

If you were to go back over that list of reasons why LISP is a
really neat thing, you might see a common thread:  dynamic typing,
automatic storage management, extensibility, interpreted code,
extensive programming tools and the like are all computationally
expensive, but they all make the programmer's life easier.  In
the long run, those features and others should combine to allow
programmers to develop better software in less time, albeit
possibly at the expense of asking the computer itself to work
harder.  That's a trade-off, to be sure, but it's a good one,
and it's one you should be thinking about all the time as you
develop software, regardless of the particular programming
language or paradigm you happen to be using.  In fact, this
issue sounds sort of like one of those software design issues we
were stressing at the beginning of this lecture, doesn't it?
Coincidence?  I think not.

The key point here is this:  The goal of everything you do as a
computer scientist is to get the computer to adapt to the user
(and user in this context includes you, the programmer), not
vice versa.  Tools are supposed to make their users' jobs
easier, not harder, and computers and their programming
languages are certainly tools.  You don't want users to become
slaves to the computer; you want to make the computer do the
work.  That's why the computer was invented in the first place.

Lecture notes by Kurt Eiselt, 1998.
Minor changes / additions by Brian McNamara, 1998.
Last updated on Wed Jul 1 00:31:34 EDT 1998 by Brian McNamara