Project 2
Implementing a symbol table and using ANTLR to type check and transform
an abstract syntax tree
Due Monday, October 25 at 11:59pm, via turnin.
Part 2 of your project involves implementation of three components:
-
a symbol table,
-
a type checker, and
-
a source-level optimizer.
The language involved is the one you parsed in part 1 of the project.
The symbol table
You are to implement a symbol table with an interface similar to that defined
in section 8.1.2 of the book. You may choose either a binary search
tree or a hash table to implement the look-up mechanism of your symbol
table. Keep in mind that you might later be implementing one or more
features like records or classes that will require direct access to the
symbol table that contains the names of their components. Thus you
should create an implementation that uses a separate symbol table for each
scope.
The type checker
The type checker is to be implemented as an ANTLR tree walker that parses
the AST generated by a parser using your grammar from part 1. The
type checker will process the declarations first, putting appropriate information
about each name it encounters in the symbol table. It then walks
through the statement part of the subtree, looking up names in the symbol
table as it encounters them. Type information must be propagated
upwards through each expression tree; checking is performed on the operands
of each operation, on the expressions used in control structures (they
must be boolean), and on the source and target of each assignment statement
(they must be of the same type).
The output of the type checker is a "decorated" AST, so it functions
as a tree transformer.
The source-level optimizer
The source-level optimizer is also to be implemented as an ANTLR tree walker.
It must look for operations with constant operands and compute the result,
replacing the operation subtree with a tree node containing the result.
Thus the tree #(PLUS 3 5) is replaced by a simple constant tree node containing
8. The optimizer is to find such optimizations for all of the types
in the language (integer, boolean and string).
Like the type checker, this optimizer functions as a tree transformer.
Because they both take an AST as input and produce a modified version of
the AST, type checking and this kind of optimizing could be done in a single
pass through the tree. I strongly advise against combining them
into a single component. It will be hard enough to write the
transformation rules without combining both of these functions in a single
specification.
Hints
This is a great opportunity to practice incremental development.
Build the symbol table and check it out with driver code that you write
yourself; you don't need to get ANTLR involved with this step. When
you begin with the type checker, first build a tree parser that just parses
and echoes the tree it receives as input. Once you are successful,
save a copy of this grammar as a starting point for building the optimizer
before you start adding in the type checking capability. Add your
type propagation and checking code a few rules at a time, testing each
increment before you add more. Do the same with the optimizer.
You will save a lot of time when an error occurs if you only have to consider
a few rules as its possible cause. You will also have the advantage
of learning from any errors you do make.
Test cases
I have a number of test cases available now. For each test case, there
are two types of output. There is normal output, which is the bare
minimum of information you need for this project, and then the verbose
output, which shows you how nice your output could be. I recommend you
first look at the normal output, and then after that check out the
verbose output.
The test cases are located here.
The verbose output requires some explanation. In addition to a symbol
table for variables, I also have a symbol table for types, and one for
methods (operators). Since my AST and these tables have wacky
references to one another, I tried to show those references by giving
each symbol table entry an identifying symbol (like #1#), and then print
a symbol each time there is a reference to another structure. If you
have questions about how to interpret my verbose output, let me know.
Turnin information
For this project, you will turn in a directory of all your files, which
should contain
- A "README" file which contains:
- Your name
- Your CoC account name
- Which language you used (Java or C++)
- Summary of what works/doesn't
- Anything out-of-the-ordinary that you want the grader to know
You should use the README template.
- Two scripts, which do the following:
- All the source code
The command is
turnin prj2 directory_which_has_all_my_files
Failure to follow turnin instructions will result in a grade deduction.
Last updated on
Sun Oct 24 10:38:17 EDT 1999
by Brian McNamara