The Educational Software Process
What is the ESP?
The Educational Software Process is
loosely based on the Personal Software Process (PSP) (Watts Humphrey),
but has been revised and simplified for an educational environment.
It outlines a process and a framework for a novice programmer to develop
the skills and knowledge to be considered a competent, maybe even expert,
programmer. It currently consists of a series of prescriptive recommendations
and
Why use the ESP?
The current undergraduate curriculum of Georgia Tech's College
of Computing offers comprehensive coverage of computing knowledge.
Students develop a good base of programming languages, acquire some practice
in small and large scale design, and learn how to develop programs that
perform a task. However, we have observed that graduating students
have not necessarily acquired good programming skills and practices.
Frequently, their programs can't be reused, maintained, or extended.
In order to become a skilled programmer or software engineer, a student
must have a good understanding of proper programming practices, good design
skills, and good metacognitive skills that encourage reflection and introspection.
Our experience also leads us to believe that the critical
period in developing good skills occurs outside of the labs and classrooms.
Students don't spend enough time with a TA coach or mentor to get good
feedback on their performance.
Goals
The overall goals of ESP are to:
-
Train students to acquire good programming habits
-
Introduce students to the concepts of proper design as early
as possible
-
Improve students' processes in design
-
Provide metrics by which students can assess room for improvement
-
Sustain this process of improvement throughout the students'
education
Methods
ESP takes two approaches to developing
and improving programming skills. The first approach takes a lot of hard-won
experience and presents it to the students in a simplified and formalized
format. There are many techniques and tricks that expert programmers know
or do, but these are usually acquired either by observing other expert programmers
or by reconstructing them during some laborious programming task. The second
approach is a process-oriented approach that encourages students to reflect
on their programming activity during and after the task and requires them
to collect data about their own activities.
Levels
Because students have different levels of knowledge and development,
we have adopted Carnegie Mellon's approach used in the Capability Maturity
Model and outlined six levels that we expect students to progress through
while going through the curriculum. When implemented, students will
eventually make a self-assessment about their current level and once they
have met some criteria (not established yet) and have been evaluated by
some ESP guru, they are allowed to advance to the next set of processes.
Because the process acknowledges that skills can be difficult to acquire,
the advancement process is only loosely coupled to the academic year that
the student has completed. A student can potentially advance from
Level 3 to 5 in a year and a half, rather than the two years that the levels
imply. Likewise, a student with very little interest in the top-level
views of software development may never progress to Level 4 by choice.
ESP Level 0 - Novice programmer
-
Completely new to formalized concepts in computing
-
Learning basic programming concepts
-
Learning basic data structures
-
Assignments focus on grounding in programming basics
-
Tracks estimated time and actual time at a simplified level
-
Feedback centered on understanding syntax and conceptual
difficulties
ESP0 is designed for students learning basic programming
constructs. ESP0 is designed with three purposes in mind. The first is
to accustom students to making predictions and taking metrics without interfering
too much with the task at hand. The second purpose is to make students
more aware of expenditure of effort. The third is to ensure that students
are spending the proper amount of time in planning and thinking through
a problem before actually writing code.
At this level, the student is asked to record time estimations
for the design, coding, and debugging stages. Then, the student will complete
the assignment. Afterwards, the student will record actual time spent for
the design, coding, and debugging stages.
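As a rough illustration of the ESP0 record, the sketch below pairs each stage's estimate with the time actually spent. It is written in Python for brevity; the names `StageTime` and `estimation_error`, and the sample numbers, are assumptions made for this example and do not come from the ESP forms.

```python
from dataclasses import dataclass


@dataclass
class StageTime:
    """Estimated and actual minutes for one stage (hypothetical record)."""
    stage: str          # "design", "coding", or "debugging"
    estimated_min: int
    actual_min: int


def estimation_error(entry: StageTime) -> int:
    """Minutes over (positive) or under (negative) the estimate."""
    return entry.actual_min - entry.estimated_min


# Made-up numbers for one assignment.
log = [
    StageTime("design", 30, 20),
    StageTime("coding", 60, 90),
    StageTime("debugging", 30, 75),
]

for entry in log:
    print(f"{entry.stage:<10} estimate {entry.estimated_min:>3} min, "
          f"actual {entry.actual_min:>3} min, "
          f"error {estimation_error(entry):+d} min")
```

Keeping the estimate and the actual time side by side is what lets a student see, stage by stage, where their predictions tend to drift.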
ESP Level 1 - Introductory programmer
-
Learning lexical structures and syntax
-
Learning basic algorithms, basic programming skills, and
basic programming processes
-
Assignments focus on combining basic programming concepts
to create more complex structures
-
Tracks estimated time and actual time at an elementary level
-
Tracks errors inserted into the code at an elementary level
-
Creates simplified designs
-
Feedback centered on understanding of syntax, style, process
improvement, and conceptual difficulties.
ESP1 is designed for students learning how to program in
their first programming language. At this level, introductory students
are using a compiler or interpreter for the first time to run their code.
ESP1 was designed with two purposes in mind. The first is to introduce
students to good programming practices, such as programming style, debugging
skills, and code reviews. The second is to make students think about their
mistakes and learn from their mistakes.
There are four themes throughout ESP1. The first is to
ensure that a concept works the way you think it does. The second is to
break up a large problem into many easier-to-solve smaller problems.
The third is to reduce complexity by keeping things as simple and easy
to understand as possible. The fourth is to plan what you are going to
do, complete your tasks in a consistent and methodical manner, and then
assess how you did.
At this level, the student is asked to keep a time recording
log more formalized than that in ESP0. Also introduced at this level is
a simplified version of the defect recording log, a simplified version
of detailed design, the code framework checklist, the code construction
checklist, the code review checklist, the debugging checklist, and process
review. Effort estimation (in terms of time) is based on the number of
routines in a module.
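One way to read the routine-based estimate is as a count of routines multiplied by a per-routine time drawn from the student's own earlier logs. The sketch below assumes that reading; the function name, the use of a simple average, and the sample figures are illustrative assumptions, not part of ESP1.

```python
from statistics import mean


def estimate_module_minutes(routine_count: int,
                            past_minutes_per_routine: list[float]) -> float:
    """Estimate effort as routine count times average past minutes per routine.

    Assumption: a plain average over earlier assignments serves as the rate.
    """
    if routine_count < 0:
        raise ValueError("routine_count must be non-negative")
    if not past_minutes_per_routine:
        raise ValueError("need at least one past measurement")
    return routine_count * mean(past_minutes_per_routine)


# Hypothetical history: minutes per routine on three earlier assignments.
history = [20.0, 25.0, 30.0]
print(estimate_module_minutes(6, history))  # 6 routines at 25 min each -> 150.0
```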
ESP Level 2 - Intermediate programmer
-
Learning more algorithmic and module design
-
Assignments focus on small program design
-
Tracks estimated time and actual time at an intermediate
level
-
Tracks errors inserted into the code at an elementary level
-
Creates more complete designs
-
Feedback centered on efficiency, completeness of design,
and process improvement
ESP2 is designed for students who have some experience with
coding. ESP2 focuses more on tracking time and errors, and on completeness
and efficiency of designs.
Introduced at this level are the High-Level Design Checklist,
High-Level Design Review Checklist, Unit Testing Checklist, and Integrated
Testing Checklist. The defect recording log is also augmented to record
where an error was injected.
ESP Level 3 - Experienced programmer
-
Predictable personal time and effort estimation across different
applications
-
Familiarity with basic software engineering practices including
configuration management, code reuse, and so on
-
Assignments focus on large program design and variety of
applications
-
Tracks estimated time and actual time at an intermediate
level
-
Tracks errors inserted into the code at an intermediate level
-
Creates more complete and sophisticated designs
-
Feedback centered on design quality, code efficiency, and
quality of code maintainability
ESP3 is designed for students with significant experience
with coding in multiple programming languages. ESP3 focuses on predictable
effort estimations, as well as complete and efficient designs. ESP3 also
focuses on software engineering processes (such as configuration management,
reusability, etc.).
At this level, the student is asked to make more well-defined
predictions. The effort estimation at this level is based on Lines of Code
(which is based on estimation and on past experience).
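A lines-of-code estimate of this kind could combine a size guess with the student's historical productivity. The following sketch is one possible arrangement, assuming past projects are stored as (lines of code, hours) pairs; `estimate_hours` and the data are hypothetical.

```python
def estimate_hours(estimated_loc: int,
                   history: list[tuple[int, float]]) -> float:
    """Estimate effort from a size guess and past (lines_of_code, hours) pairs.

    Assumption: productivity is total past LOC divided by total past hours.
    """
    total_loc = sum(loc for loc, _ in history)
    total_hours = sum(hours for _, hours in history)
    if total_loc <= 0 or total_hours <= 0:
        raise ValueError("history must contain positive size and time")
    loc_per_hour = total_loc / total_hours
    return estimated_loc / loc_per_hour


# Hypothetical past projects: 200 LOC in 8 h and 400 LOC in 12 h (30 LOC/hour).
past = [(200, 8.0), (400, 12.0)]
print(estimate_hours(450, past))  # 450 / 30 -> 15.0
```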
ESP Level 4 - Novice software engineer
-
Predictable personal time and effort estimation in group
context
-
Strongly familiar with good software engineering practices
-
Assignments focus on larger group projects that take a quarter
to finish and are completed in stages
-
Feedback centered on ability to mesh effort with schedule,
quality of design, quality of revision management, and code maintainability.
To be more clearly defined.
ESP Level 5 - Introductory software engineer
-
Predictable group time and effort estimation across different
applications (based on ESP data from members)
-
Predictable size estimation for all projects
-
Strong emphasis on life cycle models, project design and
group postmortems
-
Assignments are multi-term projects or pieces of larger team
projects
-
Feedback centered on quality of design, coordination of team
effort, and ability to adhere to scheduled life cycle
ESP Programming Activities
All software engineering lifecycle models
have some series of stages required for a program to go from design to
completion. We've taken some of the basic elements of these models,
combined them with stages obtained from Watts Humphrey's PSP, and added
one of our own. Some of these activities are described below.
-
Planning - consists of any activity
used to prepare the programmer to begin the development process.
This could also involve issues of resource and time allocation, obtaining
programming tools, preparing a workspace, or even downloading needed materials.
-
Requirements Development
- consists of any activity used to obtain program specifications and requirements.
This may be as simple as writing down all the parameters of a homework
assignment and as complicated as conducting user studies to determine all
the interface parameters.
-
Design - consists of any formal
or informal activity used to transform the program requirements into a
design framework for code to be entered in.
-
Design Review - consists of a
review process that analyzes the design of the program for completeness,
correctness, and consistency.
-
Coding - any activity that enters
code into a compilable form.
-
Code Review - an activity
conducted prior to compiling or testing that checks the code for any errors
that might have been injected during the coding phase.
-
Compile/Debugging - any activity
that takes the code to a working executable form. Also includes activities
that correct errors in the code that produce erroneous output.
-
Testing - consists of activities
that check the robustness of the code to exceptions and that verify its
conformance to the requirements.
-
Process Review - an activity
conducted by the programmer or programmers to evaluate the effectiveness
of the process used to develop the program.
-
Research - consists of any activity
used to obtain or develop more knowledge about the program. It could
consist of looking up language references in a textbook, consulting a teaching
assistant about a bug, looking up the Java API at SunSoft, or writing small
test programs to confirm one's understanding of a programming concept.
Implementing the ESP
Summer 1997 Implementation
We introduced the ESP process into the Introduction to Programming
Course offered by the College of Computing at Georgia Tech, which used
Java as its base language. For this iteration of the ESP, we required
the students to record the following items: a design, the amount of time taken to
complete an assignment, the kinds of errors they discovered while developing their
program, and a self-analysis describing the general outcomes of their
experience. The forms were made available in Microsoft Word with
the expectation that students would be programming in Java on a Windows
machine and could switch between the two applications to fill in the information
when relevant.
The Design Form
The students were required to develop a formalized design
of their program. For the first programming assignment, they were
provided with a design of the program. For the second, they were
required to take a Java API and transfer the design to the forms.
For the third and fourth assignments, they were required to develop their
own design but were provided with an expert solution developed by the TAs.
The design form consisted of a top level page that described all the classes
that would make up the program and two pages for each class that outlined
all the types, variables, and methods. The design form also required
them to estimate the amount of time it would take to develop their program.
The Time Log
The students were required to keep a log that recorded the
amount of time spent in each activity of programming. The activities
of programming used here were Planning/Design, Coding, Code Review, Debugging/Testing,
Process Review, and Research. Because the program requirements were
already developed for the students and a schedule was imposed upon them,
we collapsed the planning and design activities into one activity.
Likewise, the debugging and testing phases were combined because the students
were provided with test files and answers and didn't have to generate their
own test sets. For the sake of simplifying the activity, students
were asked to only note the time spent debugging "nontrivial" errors.
Essentially, this category included errors that were not simply syntax
errors but logical, implementation, and design errors. The loose
definition employed by the TAs was any error that took longer than about
3 minutes to fix. The form was designed to prompt students through
the stages of the process and to make them think about where they were
in finishing their program.
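The TAs' loose three-minute rule can be expressed as a simple filter over debugging episodes. This is only a sketch: the threshold constant reflects the "about 3 minutes" guideline above, while the episode format and the function name are assumptions.

```python
NONTRIVIAL_THRESHOLD_MIN = 3  # "about 3 minutes", per the TAs' loose definition


def nontrivial_debug_minutes(episodes: list[tuple[str, float]]) -> float:
    """Sum the minutes of debugging episodes that exceeded the threshold.

    Each episode is a hypothetical (description, minutes) pair.
    """
    return sum(minutes for _, minutes in episodes
               if minutes > NONTRIVIAL_THRESHOLD_MIN)


# Made-up episodes from one programming session.
episodes = [
    ("missing semicolon", 1),
    ("off-by-one in loop bound", 12),
    ("wrong method called on object", 25),
    ("misspelled variable", 2),
]
print(nontrivial_debug_minutes(episodes))  # 12 + 25 -> 37
```

A fixed cutoff like this is easy to apply, but it only approximates the intended split between syntax slips and logical, implementation, or design errors.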
The Error Log
By the fifth assignment (out of eight), the students were
required to record the kinds of errors they made while developing the program.
Again, we specified that they only track "nontrivial" errors. The
form was designed to guide them through a formal debugging process.
They were required to record how they found the error, where they found
it, how they fixed it, and how many times they had committed the error.
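The four items the form asked for map naturally onto a small record. The sketch below is an assumption about how such an entry might be represented electronically; the field names and sample entries are invented for illustration and are not taken from the Word form.

```python
from dataclasses import dataclass


@dataclass
class ErrorEntry:
    """One "nontrivial" error, with the four items the form asked for."""
    how_found: str        # e.g. test run, code review
    where_found: str      # e.g. class and method
    how_fixed: str
    times_committed: int  # how often the student has made this error


def repeated_errors(log: list[ErrorEntry]) -> list[ErrorEntry]:
    """Return entries for errors the student has committed more than once."""
    return [entry for entry in log if entry.times_committed > 1]


# Hypothetical entries.
error_log = [
    ErrorEntry("test file output differed", "Stack.pop",
               "checked for empty stack", 1),
    ErrorEntry("code review", "Queue.add", "fixed off-by-one index", 3),
]
for entry in repeated_errors(error_log):
    print(f"{entry.where_found}: committed {entry.times_committed} times")
```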
The Process Review Log
The Process Review was essentially a self-analysis form
designed to help students reflect on the experiences of their last assignment.
It asked them how long it took them to program the assignment, what kind
of design changes they made, what features they were unable to implement,
and to describe their top three problems in completing the assignment.
Students were required to submit these forms electronically
in addition to their assignment. No forms were required for the first
program. The forms represented a fraction of their grade and students
received full credit if the forms were submitted. They were not graded
on the quality of their submissions or on the content. We didn't
want to hold everyone to a standard that we had not clearly defined.
In other words, a good programmer may have been able to complete some assignment
in under X hours. However, it would be unfair to penalize a student
who recorded X+3 hours to complete because they did not take some "optimal"
time. We also wanted to remove any incentives to misreport the numbers
in hopes of better treatment from the TAs.
Results of the Implementation
-
Poor implementation of the forms - We encountered
numerous problems with using the forms in Microsoft Word format.
Essentially, Word was a poor choice of medium for implementing the ESP.
-
Problems with file translation and format - Students
had numerous difficulties making sure the files were in the correct format
when downloading or submitting the forms. Some students did not have
Word installed on their home computers, and some of them chose to use
RTF format instead.
The TAs had logistical problems maintaining that many Word files on the
class directory (4 forms for 8 assignments from 200 students, with some
files 40K in size because of the scripts and graphics they used).
-
Additional Translation Effort - Because the Design
form didn't translate directly to Java, students were forced to essentially
type in their program design twice, making the assignment longer to complete.
-
Incomplete Forms - Students bypassed parts they didn't
understand or care about. Because the ESP was such a small fraction
of their total grade (maybe 5%), they would turn in the minimum amount
necessary to satisfy the TA, leaving many of the areas unanswered or empty.
For example, a Time Log would show some very optimistic times, with
5 exact stages in order. However, for someone to spend three hours
coding and then only 30 minutes testing would mean that they had divine
coding abilities: they encountered no bugs at
all, typed their code in one shot, and were able to go straight to testing.
This information may have been true in the abstract, but it obscured the details
of the process.
-
Increased workload for the Teaching Assistants - The
Design Forms presented significant problems for both the students and the
TAs. If an object-oriented program required 10 classes, the TAs would
have to look through a minimum of 21 pages worth of design: 1 for the top
level design and 2 per class. Many TAs reported having to work an
additional 5-10 hours a week to compensate for these. The workload
had detrimental effects on morale and TA motivation.
-
Difficult to collect data - The combination of the
poor format and the variable reporting of the students made any data collection
very difficult and impossible to automate.
-
Students didn't distinguish between different programming
activities - In the Time Log, a good number of students didn't record
absolute stages. Instead they would claim to have spent 1 hour designing,
coding, and debugging a particular class, for example. There are
two inferences that can be made here. One is that the students did
not understand the programming process as described in class as being a
set of individual activities that follow on one another. The other
is that the novice programmer doesn't know how to make distinctions of
that kind yet. For example, if a programmer adds a class, that is
clearly a change in design. On the other hand, if the programmer
changes how an algorithm performs in the middle of coding, they are sort
of redesigning the program but not in the same way that the Design activity
describes. It's possible that those students saw an activity as all
stages that got a piece of code to work.
-
Granularity unclear - There were some heated arguments
from the students about how fine the granularity needed to be when reporting
times or "nontrivial" errors. This became especially relevant for
programs that took longer than 8 hours. Does a 2 minute activity
have any significance out of 480 minutes? On the other hand,
if the program only took 5 hours, does the student or the TA learn anything
from the student's reporting of 1 hour spent in each of 5 activities? Lastly,
in order to report at a fine granularity, the student has to disrupt the programming
activity many times, creating more work in the long run.
-
Students misreporting - For various reasons, students
misreported their data. One common form of misreporting was filling in
estimated times after the program was finished, rather than recording
exact times. This data, while useful for estimating average class
times, is almost useless for its original purpose: to guide the
individual toward making more accurate time estimates and becoming more aware
of how they spend their time. Another form of misreporting had to
do with students attempting to meet someone's expectations. No direct
expectations were communicated. However, students believed that their
grades were affected by the numbers they reported. They would inflate
their times in hopes of getting more sympathy from the TA (usually when
the program didn't work) or they would estimate some optimal time and show
that they had completed their program within the optimal range in hopes
of getting better grades.
-
Data Analysis difficult - Given the confounds already
described, the data analysis was a fairly difficult process.
Even after we threw out the outliers and seemingly bogus data, problems
remained. For example, students who reported doing code review
and research showed a significant difference in their ability to code "efficiently."
We use the term efficiently to mean that they spent a smaller percentage
of their time making changes to their design or fixing code. On
the other hand, students who didn't do code review or research finished
their programs in less time, on average. The reason for this result
is that expert programmers skipped those steps and finished their programs
rapidly. Weaker programmers would report doing those activities,
possibly out of desperation, but also showed times that were 1.4x the class
average. Either way, it would be difficult to claim that the ESP
served its intended purpose.
-
Teaching Assistants insufficiently prepared to support
ESP - We did not spend an adequate amount of time preparing the teaching
assistants to support the students in conducting their ESP activities.
This made it difficult for Teaching Assistants to provide feedback about
how students could be altering their programming behavior to make improvements,
based on the data they were observing.
-
Course did not integrate ESP sufficiently - We went
to great lengths to ensure that the ESP was integrated into the course
structure. However, the course itself spent little time
discussing the particulars of the programming process and concentrated instead on the
details of implementing the language. If the latter is unavoidable,
then we need to rethink how much of ESP can be deployed at this stage in
the curriculum.
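The workload figures quoted in the list above follow from simple arithmetic. The sketch below reproduces the page count for a 10-class design and an upper bound on the number of form files; it assumes every student submitted every form for every assignment, which overstates the real total since some forms were introduced later in the term. The function names are made up for this example.

```python
def design_pages(class_count: int) -> int:
    """Pages in one design form: 1 top-level page plus 2 pages per class."""
    return 1 + 2 * class_count


def max_form_files(forms: int, assignments: int, students: int) -> int:
    """Upper bound on submitted files if every form were always turned in."""
    return forms * assignments * students


print(design_pages(10))           # 1 + 2 * 10 -> 21
print(max_form_files(4, 8, 200))  # 4 * 8 * 200 -> 6400
```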
Research Issues
-
It's unclear to the students whether there is any
real benefit to performing such activities. In general, it's
much more difficult to put effort towards preventing a potential negative
than towards achieving a positive. It's possible that the implementation
and delivery of the ESP need to be geared more towards showing positive
benefit than towards moral exhortations that make unsupported claims about
how much time will be saved or how many bugs avoided.
-
Did students who diligently followed the practices of
the ESP gain any real benefit? We don't know who these students
are. There is the additional problem that students likely to follow
such a process, as currently implemented, will also be the kind of students
who work hard on their assignments and will be likely to get good grades
based on the assessment criteria whether they use ESP or not. Also,
we have no processes in place to track their progress past this one course.
-
ESP disrupts the thought processes and programming flow.
In DeMarco and Lister's Peopleware, they cite the need for uninterrupted
work as one of the hard requirements for getting good work done.
The ESP, by necessity, calls for disruptions in the workflow to get the
student to think about how they are performing their activity at that time.
However, this simply frustrates novices who are trying to complete a deliverable
to achieve a grade. Is there a way to balance these two requirements
so that learning is encouraged without disrupting the activity? Or
that the activity is only disrupted to prevent students from acquiring
bad habits?
-
Optimistic linear processes. One of the criticisms
of the waterfall model of software development is its linear flow, when
most software development requires revisiting some of the earlier stages,
sometimes repeatedly. We observed a second problem. The distinctions
between when someone is coding to change a design, fix a bug, or do research
are very fine and are sometimes made only by an expert. Are these
distinctions important to make and can they be made clear to a novice?
-
Needs to be integrated within course curriculum -
Given that programming, as outlined by the ESP, is described as a series
of skills as well as knowledge of practices, can they be taught in a classroom
environment or does some other kind of supplementary activity need to be
provided in conjunction with the course material to support the acquisition
of this information? If so, are there good heuristics for doing so?
Future Implementation Directions
-
Integrated tool for design and tracking processes as opposed
to forms - Currently, we are working on a tool that will enable students
to enter a design and obtain Java code from it and a tool that will standardize
the recording and processing of the information entered by the students.
The hope is that the process will become more streamlined from using this
tool as opposed to using forms.
-
Integrated environment - Ideally, we want the ESP
recording tools to be transparent to the programmer and capture information
that would be either impossible or onerous for a human to capture.
The environment should consist of and do the following:
-
provide an editing, compiling, and debugging environment
for software development
-
provide design tools
-
track the time spent in design, coding, compiling, and testing
and provide information to the programmer on request
-
track major design changes (new methods, classes, variables)
-
track the kinds of compiler errors that the programmer encounters,
maybe retaining code that shows how the problem was solved the last time.
-
provide an automated process review that reports the total
time spent on the program, the longest time spent on any compiler error,
and asks questions for the programmer to reflect on their progress.
-
provide tools to catch undesirable programming behavior and prompt
the programmer to become aware of this.
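As a sketch of what the automated process review might compute, the code below derives the total time and the longest compiler-error fix from recorded events. The event format, the names, and the sample data are all assumptions; the tool described above was still being developed and its design is not specified here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    """A hypothetical timed event captured by the environment."""
    activity: str   # e.g. "design", "coding", "compiling", "testing"
    minutes: float
    compiler_error: Optional[str] = None  # set only when fixing a compiler error


def process_review(events: list[Event]) -> dict:
    """Summarize total time and the longest time spent on a compiler error."""
    total = sum(e.minutes for e in events)
    error_events = [e for e in events if e.compiler_error is not None]
    longest = max(error_events, key=lambda e: e.minutes, default=None)
    return {
        "total_minutes": total,
        "longest_error": longest.compiler_error if longest else None,
        "longest_error_minutes": longest.minutes if longest else 0,
    }


# Made-up session data.
events = [
    Event("design", 40),
    Event("coding", 90),
    Event("compiling", 5, "missing return statement"),
    Event("compiling", 15, "incompatible types"),
    Event("testing", 30),
]
print(process_review(events))
```

A summary like this could be shown alongside the reflection questions, so that the student answers them with the recorded numbers in view.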