I. Static vs. Dynamic Languages
Certainly one of the things we've stressed in this class so far is the idea
that software development involves making informed choices, and that those
choices carry with them both costs and benefits. For example, we've
discussed more than once the trade-offs involved with functional versus
non-functional programming styles, even within the confines of a single
programming language. A choice between one type of programming language and
another will also have impacts on how you approach a given problem, what form
your solution will take, and what you'll be allowed to do in implementing
that solution. The more "mainstream" languages, which can be classified as
"static languages", are based on computing technologies from the 1950s and
1960s, and they tend to force programmers into the following paradigm:
They tend to be batch-compiled languages, meaning you
wait for the entire program to compile and link after
every change.
Development environments are usually based on ASCII
source files and command-line interfaces (although there
are some notable exceptions recently).
Compiled programs have little or no runtime type
checking, which usually results in a complete crash when
an error occurs.
Memory is managed manually--code has direct memory
references or pointers. Code is hard to write and
understand, and can harbor subtle, hard-to-fix bugs.
When managed manually, memory allocation is inefficient,
buggy, not portable, and hard to integrate with other
code libraries. This is why memory-related errors are
responsible for the majority of all bugs in a typical
static language program.
C, C++, and Pascal are examples of static languages.
Another class of languages, called dynamic languages, has grown
in popularity in recent years. These languages enable a different,
more interactive approach to programming:
They allow faster development and rapid prototyping.
They usually have dynamic type information--errors are
usually detected before execution, and when they do
occur, the errors are usually recoverable and do not
cause crashes.
They usually have automatic memory management, which
hides the details of memory management from the
programmer. Code is simpler, cleaner, more reliable,
and more reusable.
They usually have a shell or "listener" so the user can
execute code interactively (either via an interpreter or
an incremental compiler).
What languages are dynamic? Smalltalk (whose offspring, Squeak, many of you
will deal with next year) and Scheme (along with its granddaddy, LISP) are
dynamic languages. Java is considered a dynamic language. A relative
newcomer to the world of dynamic languages, but one that is growing in
popularity by leaps and bounds, is called Python. We'll be talking about
Python for the next couple of days.
A big benefit of dynamic languages is that they tend to be much more
extensible than static languages. That is, when the needs of the programmer
change, the language can be extended to adapt to those changes. Static
languages are less flexible in that regard.
II. Beginning Python
Python is a dynamic, interpreted, object-oriented language with a syntax
that looks a lot like a more traditional imperative/block-structured language
yet behaves like Scheme in many ways. It was invented by Guido van Rossum
somewhere around 1989 or 1990, and is growing in popularity by leaps and
bounds. It's used commercially by the folks at Google as well as
Industrial Light and Magic, just to name a couple, and will also be
used by those brave souls who try out our new introductory CS course
for non-CS majors, CS 1315, in the spring.
And to file away in your collection of computing trivia, the name "Python"
does not refer to the snake. Guido is a Monty Python fan, and the name
of the language is a tribute. You don't have to be a Monty Python fan
to understand the language, but it sometimes helps when reading some of
the literature. (And if you don't find Monty Python funny, you should
spend some time in England. The humor will then start to make all kinds
of sense.)
Most of what follows is borrowed from "Python: Essential Reference" by
David M. Beazley, which I highly recommend. I'm also very fond of "Core Python
Programming" by Wesley J. Chun. You can find out even more about Python
at www.python.org.
We'll get to how to create classes and objects in Python, but before we get
there we'll have to get comfortable with the simple stuff. So we'll start
with the details of things like data types and statements and operators, the
stuff that goes inside the classes, and then we'll talk about how to wrap all
that stuff inside the magic incantations that make classes, objects, and
methods. And before we start, here's one important disclaimer: We're only
talking about a simple subset of Python from here on. There's a lot more
Python to know than what we'll introduce in this class.
III. Identifiers
These are the names used to identify variables, functions, parameters,
classes, and so on. They can include letters, numbers, and the underscore
(_) character, but must always start with a non-numeric character. Special
symbols such as $, %, and @ are not allowed. Python is case-sensitive, so
foo is not the same as Foo or FOO.
IV. Comments
The "ignore everything from here to the end of the line" comment begins with
a #:
# here's one example
i = 0 # here's another
So the "#" in Python is like the ";" in Scheme. Python has no separate
multiline comment syntax; a longer comment just starts every line with a #.
V. Keywords
There are a bunch of words (28) that are reserved by Python. Don't use them
as identifiers.
and         elif        global      or
assert      else        if          pass
break       except      import      print
class       exec        in          raise
continue    finally     is          return
def         for         lambda      try
del         from        not         while
VI. Data Types
Just like Scheme, Python has data types (integers, floating points,
strings, lists, and so on) but it worries about them for you, again just
like Scheme. Python is a dynamically typed language. You don't have to
declare variables or types in advance; Python determines type and allocates
memory at run time, just like Scheme.
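To see what "Python worries about types for you" might look like, here's a minimal sketch in which one variable name refers to values of three different types in turn. It uses the built-in isinstance function to ask Python what kind of value x holds at each point.

```python
# print(...) with parentheses also runs on newer Python versions.
x = 5                          # no declaration needed; x names an integer
print(isinstance(x, int))      # True

x = "five"                     # the same name now refers to a string
print(isinstance(x, str))      # True

x = [5, 5.0, "five"]           # and now to a list of mixed types
print(isinstance(x, list))     # True
print(len(x))                  # 3
```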
VII. Expressions and operators
Scheme programs are constructed from compound expressions; when evaluated,
Scheme expressions return values. So things like numbers and symbols bound to
values (variables) and function calls are all expressions. On the other hand,
Python programs are constructed from sequences of statements, and statements
don't necessarily return values when executed. Python statements do however
contain simple expressions.
In Python, arithmetic and boolean expressions are written in infix notation,
where the operator sits between its operands, as opposed to Scheme's prefix
notation, where the operator sits in front of its operands. So the Scheme
expression
(and (< (+ (* x x) (* y y)) 25) (> x 0))
is written in Python as
(x*x + y*y < 25) and (x > 0)
In Python, the boolean constants are True and False (as opposed to #t and #f
in Scheme), and they behave like the integers 1 and 0. Any non-zero number, or
any nonempty list, string, tuple, or dictionary is interpreted as true; zero
and the empty versions of those types are interpreted as false. (What's a
tuple or a dictionary? Just more Python data types. We won't talk about them
again.) That's not quite like Scheme, but on the other hand, Python evaluates
boolean expressions left to right and stops when it doesn't need to evaluate
any more (i.e., short-circuit evaluation) just like in Scheme.
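If you'd like to check these rules for yourself, something like the following sketch should do. The built-in bool function reports how Python interprets a value, and noisy is a hypothetical helper, invented here, that records each time it gets evaluated.

```python
# print(...) with parentheses also runs on newer Python versions.
print(bool(0))          # False
print(bool(42))         # True
print(bool([]))         # False (empty list)
print(bool("hello"))    # True (nonempty string)

calls = []              # records which operands actually get evaluated

def noisy(value):
    # Hypothetical helper: note that we were called, then hand the value back.
    calls.append(value)
    return value

# 0 is false, so the "and" stops before calling noisy(1);
# the "or" then has to evaluate its right-hand side.
result = (0 and noisy(1)) or noisy(2)
print(calls)            # [2]
print(result)           # 2
```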
Here are some arithmetic operators that'll come in handy:
+ addition and string concatenation
- subtraction
* multiplication
/ division
% mod (remainder from integer division)
And here are some predicates (I haven't seen the word "predicate" in the
Python literature, but that's how you know them...in Python, these are just
considered to be more operators):
< (less than)
<= (less than or equal)
> (greater than)
>= (greater than or equal to)
== (equal)
!= (not equal)
not
and
or
VIII. Operator precedence
So when Python sees an expression to be evaluated, which operations are
performed first? That depends on the "precedence" of the operator. Operations
with higher precedence are performed first. Here's the order of precedence
for the operators above, with highest precedence at the top of the list:
function calls
** (power)
-x, +x (unary minus, unary plus)
* / % (mult, div, mod)
+ - (addition and subtraction)
< > <= >= == != (relational operators, equality)
not
and
or
So the expression
10 - 2 * 4
evaluates to 2, not 32, because the multiplication is performed before the
subtraction. You can override the precedence with the careful application of
(tadaa!) parentheses. The expression
(10 - 2) * 4
would then evaluate to 32. When things get confusing, always use parentheses
to make things more obvious to others as well as yourself.
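All the infix operators above except ** are left-associative. That is, if
infix operators of equal precedence from the list above are chained together
in a single expression, the leftmost operator is applied first (i.e.,
operations are performed left to right if the operators have the same
precedence). The exception, **, groups right to left, so 2 ** 3 ** 2 means
2 ** (3 ** 2). The relational operators are a special case too: a < b < c is
shorthand for (a < b) and (b < c).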
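The precedence rules are easy to try out. Here's a short sketch; the variable names a through d are just labels for the results.

```python
# print(...) with parentheses also runs on newer Python versions.
a = 10 - 2 * 4             # multiplication first: 10 - 8
b = (10 - 2) * 4           # parentheses first: 8 * 4
c = 10 - 4 - 3             # equal precedence, left to right: (10 - 4) - 3
d = 1 + 2 < 4 and not 0    # arithmetic, then <, then not, then and

print(a)   # 2
print(b)   # 32
print(c)   # 3
print(d)   # True
```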
IX. The assignment statement
As noted earlier, Python programs express computations as sequences of
statements, like the "procedural" style of Scheme programming we introduced
in recent weeks, instead of as nested or compound expressions. So you might
guess that a Python program is heavily reliant on side-effects, and that you
might see a lot of assignment statements (the equivalent of let and set!).
You'd be right on both counts. The most common form of Python statement is
the assignment statement. The basic form or syntax looks like this:
*var* = *expr*
where *var* is a legal Python variable name, and *expr* is a legal Python
expression. So
x = 5
says that the variable named x has the value 5.
Note that unlike some languages you may be familiar with, a statement like an
assignment statement above doesn't need to end with a semicolon (;)...a
linefeed will suffice. You could add a semicolon
x = 5;
and you can separate statements on the same line with semicolons
x = 5; y = 10; z = 0
which would be the same as
x = 5
y = 10
z = 0
but Python favors linefeeds and indentation over punctuation marks like
semicolons.
If you have a line, expression, statement, or whatever that you want to carry
across to a new line and have the line feed ignored, you use the
line-continuation character (\):
a = 3*2 + 5*7 - \
    6*8
X. Organizing your code
Python relies on indentation, not parentheses or curly braces or "begin...end"
to denote blocks of code, such as the bodies of functions, conditionals, or
loops. The amount of indentation used for the first statement of a block is
arbitrary, but the indentation for the whole block must be the same.
if x == 3:
    y = 7
    z = 4
else:
    y = 6
    z = 2
The above example works. So does the example below.
if x == 3:
  y = 7
  z = 4
else:
        y = 6
        z = 2
But this doesn't work:
if x == 3:
    y = 7
    z = 4
else:
    y = 6
      z = 2
because the statements in the else block don't have the same indentation.
If you have really short blocks that fit on one line, you can do this:
if x == 3: y = 7; z = 4
else: y = 6; z = 2
But you may be sacrificing readability.
If you want to indicate an empty body or block, just use the pass statement:
if x == 3:
    y = 7
    z = 4
else:
    pass
Oh, you can use tabs for indentation instead of spaces, but the Python books
discourage this.
XI. Selection statements
You've already seen it, but just to be thorough, the standard "if" statement
syntax looks like this:
if *expression*:
    *statement*
    :
    *statement*
else:
    *statement*
    :
    *statement*
Note the colons. They have to be there. Linefeeds aren't sufficient in
those places. To handle multiple test cases (sort of like Scheme's cond),
you use "elif":
if x == 3:
    y = 4
elif x == 5:
    y = 6
elif x == 7:
    y = 8
else:
    y = 0
XII. Iteration statements
There are two standard iteration or loop constructs in Python. One is the
while statement:
while *expression*:
    *statements*
The while statement executes its associated block of statements until the
*expression* evaluates to false:
x = 0
while x < 10:
    print x*x
    x = x + 1
This will produce the following output when executed:
0
1
4
9
16
25
36
49
64
81
The other iterative structure is the for statement, which iterates over the
members of a sequence (what's a sequence? a list for example...more about
that later, just think of a list of numbers for now). Here's an example that
does the same thing as the while example above:
for x in range(0, 10):
    print x*x
The range(i,j) function constructs a list of integers with values from i to
j-1. If the starting value is omitted, it's taken to be zero, so you could
rewrite the previous example as
for x in range(10):
    print x*x
An optional third argument can indicate a step or stride:
for x in range(0, 10, 2):
    print x*x
The 2 indicates that the increment is 2 instead of 1, so the sequence of
integers in the range is 0, 2, 4, 6, 8, and the loop prints
0
4
16
36
64
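To look at the sequences that range produces without writing a loop, you can print them directly. In this sketch the list(...) wrapper is an addition so that the contents also show up on newer Python versions, where range no longer builds a list right away. The negative step in the last range call isn't covered above; it counts downward.

```python
# print(...) with parentheses also runs on newer Python versions.
print(list(range(0, 10)))       # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(list(range(10)))          # same thing: the start defaults to 0
print(list(range(0, 10, 2)))    # [0, 2, 4, 6, 8]
print(list(range(10, 0, -3)))   # [10, 7, 4, 1]

# Sum the squares of 1 through 4 with a for loop.
total = 0
for x in range(1, 5):
    total = total + x*x
print(total)                    # 30
```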
XIII. Functions
Let's say that maybe, just maybe, some of what we've been talking about all
semester has sunk in and you'd like to break your Python program into small
components called functions. How do you do it? You use the def statement:
def power(x):
    result = x * x
    return result
Then when you execute the following function calls:
print power(2)
print power(3)
print power(4)
you get:
4
9
16
When you create variables within a function, their scope is local, just like
with Scheme's let, but you don't need a separate function like let to make it
happen. You just do the assignment as with result in the example above. And
if you tried to print the value of result after printing power(4), you'd get
an error because result is undefined outside of the power function.
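Here's one way you might watch that error happen without stopping the program. The try and except keywords haven't been explained in these notes; they're used here only to catch the error, and saw_error is a made-up flag.

```python
# print(...) with parentheses also runs on newer Python versions.
def power(x):
    result = x * x       # result is local to power
    return result

print(power(4))          # 16

try:
    print(result)        # no variable named result exists out here
    saw_error = 0
except NameError:
    saw_error = 1        # Python complained that result is not defined

print(saw_error)         # 1
```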
Hey, now that you know how to make a function, you know what comes next,
right? Recursion! And if we have recursion, factorial can't be far behind:
def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n - 1)
And when you execute the following:
print factorial(0)
print factorial(1)
print factorial(2)
print factorial(3)
print factorial(100)
you get:
1
1
2
6
93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000
Why, it's just like the old days, isn't it? Almost. Python puts an
arbitrary limit on recursion depth. You can change it, but there's no need
for now. And tail recursion doesn't solve the problem, because Python,
unlike Scheme, doesn't optimize tail calls: every call gets its own stack
frame, whether or not it's in tail position. But in case you're interested,
here's what a tail-recursive factorial function in Python looks like (it
would run in constant stack space if Python were properly tail-recursive):
def factorial_tr(n):
    return factorial_tr_help(1, 1, n)

def factorial_tr_help(product, counter, maxcount):
    if counter > maxcount:
        return product
    else:
        return factorial_tr_help(counter * product, counter + 1, maxcount)
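If you want to see the recursion limit bite, a sketch along these lines should work. It asks the sys module for the current limit and then deliberately recurses past it with the tail-recursive version. The hit_limit flag is made up for illustration, and the try/except is there only to keep the program from stopping.

```python
# print(...) with parentheses also runs on newer Python versions.
import sys

def factorial_tr(n):
    return factorial_tr_help(1, 1, n)

def factorial_tr_help(product, counter, maxcount):
    if counter > maxcount:
        return product
    else:
        return factorial_tr_help(counter * product, counter + 1, maxcount)

print(factorial_tr(5))             # 120

limit = sys.getrecursionlimit()    # often 1000, but it can vary
try:
    factorial_tr(limit + 100)      # needs more nested calls than allowed
    hit_limit = 0
except RuntimeError:               # newer Pythons raise RecursionError, a subclass
    hit_limit = 1

print(hit_limit)                   # 1
```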
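Parameters in Python functions are passed as references to objects, so if a
mutable object (like a list) is passed to a function where it's modified,
those changes will be seen by the calling procedure. Assigning a new value to
the parameter name inside the function, on the other hand, only rebinds the
local name and leaves the caller's variable alone.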
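A minimal sketch of what that means in practice, using two hypothetical functions: one mutates the list it's given, and the other merely assigns to its parameter name.

```python
# print(...) with parentheses also runs on newer Python versions.
def add_item(items):
    items.append(99)         # mutates the list object the caller passed in

def replace_items(items):
    items = [0, 0, 0]        # rebinds the local name only; the caller's list is untouched

numbers = [1, 2, 3]

add_item(numbers)
print(numbers)               # [1, 2, 3, 99]

replace_items(numbers)
print(numbers)               # [1, 2, 3, 99]
```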
Oh, there are lists, and lambdas, and map, and functions like append (which
mutates the data structures instead of manipulating pointers)...woohoo!!
XIV. Lists
Python has lists, but they're not like Scheme lists. They're more like Scheme
vectors, except that you can make them longer or shorter as needed. That is,
Python lists are dynamic and Scheme vectors are static. Like Scheme vectors
and lists, Python lists can consist of all kinds of different data
types...they don't have to contain all numbers or all strings or whatever.
Here's how to make a list of numbers:
a = [10, 20, 30, 40]
and you reference the individual components using indices into the list,
starting with 0, just like with Scheme vectors:
print a[1]
gives
20
You can add something to the end of an existing list using the append()
method (it's an object-oriented thing...we'll explain methods later):
a.append(50)
print a
[10, 20, 30, 40, 50]
Append is destructive...it mutates the data structure that it works on, it
doesn't just return a different pointer. That's a big difference. You can
concatenate lists using the + operator:
b = [1, 2, 3] + [4, 5, 6]
print b
[1, 2, 3, 4, 5, 6]
Concatenation, unlike append, is not destructive: + builds a new list and
leaves its operands untouched. There's still no well-behaved cons, because
building a new list this way copies every element. And since there's direct
access to every component in a list, there's no need for an equivalent to car
and cdr either. (b[0] returns the first element of list b, and b[1:] returns
everything but the first element, but the Python gurus suggest that you don't
get in the habit of using them this way. You'll forget that you're in a
non-functional world and things will go haywire. These are called lists, but
they're not linked lists in the Scheme sense.)
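One way to check which list operations mutate and which build something new is to keep a second name for the same list and see whether it changes. This is just a sketch; alias and rest are made-up names.

```python
# print(...) with parentheses also runs on newer Python versions.
a = [10, 20, 30]
alias = a                 # a second name for the same list object

a.append(40)              # changes that object in place
print(alias)              # [10, 20, 30, 40]

b = a + [50]              # + builds a brand-new list
print(a)                  # [10, 20, 30, 40]
print(b)                  # [10, 20, 30, 40, 50]

rest = b[1:]              # a slice is a new list too, not a pointer into b
rest[0] = "changed"
print(b[1])               # 20
```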
Since you can get to the middle or end of a Python list as easily as the
beginning, you can insert things anywhere:
b.insert(2, "foo")
print b
[1, 2, 'foo', 3, 4, 5, 6]
And you can nest lists:
c = [[1, 2, 3], [4, 5, 6]]
print c[0]
[1, 2, 3]
print c[0][1]
2
And of course you can get the length of a list:
print len(b)
7
Naturally, you can iterate over a list. For example, you can drive a
for loop by the list itself:
for i in b:
    print i
1
2
foo
3
4
5
6
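Putting a few of these pieces together, here's a sketch that walks a nested list with two for loops and collects a total for each inner list using append. The names row_totals, row, and value are invented for the example.

```python
# print(...) with parentheses also runs on newer Python versions.
c = [[1, 2, 3], [4, 5, 6]]

row_totals = []
for row in c:                  # row is each inner list in turn
    total = 0
    for value in row:          # value is each number in that inner list
        total = total + value
    row_totals.append(total)

print(len(c))                  # 2
print(row_totals)              # [6, 15]
```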
That's all for now. More next time.
Copyright (c) 2003 by Kurt Eiselt. All rights reserved, with
the exception of stuff that belongs to somebody else.
Last revised: December 2, 2003