5. Interpreting Table Output
Aristotle stores analysis results in your
database directory. You can view these results from
the menus using Aristotle's table viewers.
You can also view graphs from the menus using
Aristotle's graph viewers. The following
sections may help you interpret the analysis results
that are presented using these two mechanisms.
Table Contents Menu
For purposes of illustration, here is a simple
table containing control flow information for
a function, example.c, that contains
two subroutines, main and get:
Control flow information for file: example.c
Function: main
CFG root node: 1
node label endline cf-successors (node-id(edge-id,"edge-label")
----------------------------------------------------------------------
1 E 6 3(1,"")
2 X 14
3 D 7 4(2,"")
4 F 8 5(3,"")
5 C 9 6(4,"")
6 P 0 7(5,"T") 8(6,"F")
7 S 10 9(7,"")
8 S 12 9(8,"")
9 J 0 2(9,"")
Function: get
CFG root node: 1
node label endline cf-successors (node-id(edge-id,"edge-label")
----------------------------------------------------------------------
1 E 19 3(1,"")
2 X 24
3 D 20 4(2,"")
4 F 21 5(3,"")
5 S 22 2(4,"")
Although each table that you can view using
the table viewers contains different
information, there are a few characteristics
common to most tables:
- Most tables report information on a per-function basis.
That is, for each function in your source file, the
table lists a block of information for that function.
In the example, the table gives a block of information
for function main, followed by a block
of information for function get.
- Most tables contain (at least) columns labelled
"node", "label" and "endline".
The table in the example contains these three columns.
These columns have the following meaning:
You can view the following tables using Aristotle.
- Table type 1 - cfg (control flow graph information).
- The table given as an example at the top of this section
is a cfg information table.
The cfg information table contains
the three general columns described above, and
a fourth column called "cf-successors".
The cf-successors column lists, by node id,
the control flow successors of the
node named in the left hand column.
In other words, there is an edge in
the control flow graph from
the node in the left-hand column to
each "cf-successor node".
The column also lists, for each successor,
the unique edge-identifier for that edge,
and the label attached to that edge
(possibly an empty label).
- Table type 2 - cdg (control dependence graph information).
- This table contains the three general columns mentioned above, along
with a column called "cd-successors". The cd-successors column lists,
by node id, the control dependence successors
of the node named in the left hand column.
In other words, there is an edge in the control dependence graph from
the node in the left-hand column to each cd-successor node.
Some cdg edges have labels attached to them;
these labels are reported inside parentheses, following the node id in the
cd-successors column.
- Table type 3 - ddg (data dependence graph information).
- This table contains the three general columns mentioned above, and a column
called "dd-successors".
The data dependence successors column is divided
into two parts: a list of flow dependence successors and
a list of output dependence successors.
The column lists, by node id,
the nodes that are data dependent on the node in
the left hand column, along with (in parentheses) the
name of the variable that causes this dependence.
A similar list denotes output dependencies.
- Table type 4 - pdg (program dependence graph information).
- This table contains the three general columns mentioned above,
and a column called "successor-nodes"
There are three types of successor-nodes: control dependence (CD),
data dependence (DD), and control flow (CF) successors.
These successors are listed the same way as in the
tables for the control flow graph, control dependence graph, and
data dependence graph.
- Table type 5 - map (source-code-to-node mapping information).
- This table contains the three general columns mentioned above,
and columns called "type" "subtype" "strtline"
"strtrtl" and "endrtl".
The "type" field corresponds to the
single-letter label that appears in column two,
that is, the label is a single letter distillation of the type.
The subtype field further classifies the node type.
Subtypes are chosen from the following:
RETURN_STATEMENT DO_WHILE_CONDITION
EXIT_STATEMENT FOR_CONDITION
BREAK_STATEMENT ENTRY_REGION
CONTINUE_STATEMENT IF_CLAUSE_REGION
GOTO_STATEMENT ELSE_CLAUSE_REGION
OTHER_STATEMENT LOOP_TRUE_REGION
IF_PREDICATE LOOP_FALSE_REGION
WHILE_PREDICATE WHILE_HEADER_REGION
DO_WHILE_PREDICATE DO_WHILE_HEADER_REGION
FOR_PREDICATE FOR_HEADER_REGION
SWITCH_PREDICATE CASE_REGION
IF_CONDITION SUMMARY_REGION
WHILE_CONDITION DONT_CARE
The strtline column, in the future, may hold the
location of the first line on which a statement
appears, which for multi-line statements may
differ from the location of the last line.
Due to details of the implementation of the
parser, from which we determine these line
numbers, this field is currently set to 0.
The strtrtl and endrtl columns contain
information which is for the internal use of
the Aristotle system only.
For nodes not associated with code (e.g. region nodes)
these fields have value 0.
- Table type 6 - sym (symbol table information).
- At the beginning of this table, the global identifiers that appear in the file
are listed. For each identifier, a negative id number is assigned and appears
in the first column. The second column indicates whether the identifier is a
pointer, the third column indicates whether it is a static variable, and the
fourth column indicates whether it is externally declared. The last column
lists the name of the variable.
After listing global identifiers, this table reports information that is
local to each function in the source file.
For each function, the table lists the function name, and
any formal parameters used in the function. The table then lists the non-global
identifiers used in the function. The sub-table containing information about
non-global parameters has six columns. The first column lists the variable id,
the second column says whether the variable is a LOCAL variable or PARAMETER,
the third column indicates whether the variable is a pointer, the fourth column
indicates whether the variable is static, and the fifth column indicates whether
the variable is external. The last column lists the name of the variable.
Next, for the given function, the table lists call sites within the function.
For each call site, the table lists the node number of the call site, the last
line of the call statement, the procedure called, the node number of the
corresponding return node, and the parameters used in the call.
The last part of the symbol table information for each function lists exit site
information. For each exit site, the table lists the node number, the node
label (X for the Exit node, or S for Return or Exit statements), and the last
line of the exit statement.
- Table type 7 - du (local definition-use information).
- This table contains the three
general columns described above, and a column labeled
"identifiers defined/used (and their occurence-types)". This column lists,
by name, the variables that are defined and used in the statement
corresponding to the node named in the left-hand column.
In addition to identifying the variables defined and used at a statement,
this table also notes the types of uses and definitions that occurred
in that statement.
In the following explication, suppose
x, p, q, and r are scalar,
pointer, pointer- pointer variables, and, pointer- pointer- pointer
variables respectively;
that is, suppose a function func contains
the following declarations:
int x;
int *p;
int **q;
int ***r;
Suppose further that we are considering the
occurences of uses and/or definitions in
statement s in func.
Aristotle distinguishes the following
types of definition and use occurences.
- Scalar variable use (svu).
- If s uses the value of x,
then s has a scalar variable
use of x. For example,
the statement "*p = x"
has an svu of x.
- Scalar variable assignment (sva).
- If s sets the value of x,
then s has a scalar variable assignment
of x. For example,
the statement "x = *p"
has an sva of x.
- Indirect variable use (ivu).
- If s uses the value of *p
(the memory location pointed to by p),
then s has an indirect variable
use of p. For example,
the statement "x = *p"
has an ivu of p. IVU is used to represent
an arbitrary number of indirections. So, "x = **q" and
"x = ***r" are also an indirect variable use.
- Indirect variable assignment (iva).
- If s sets the value of *p
(the memory location pointed to by p),
then s has an indirect variable
assignment of p. For example,
the statement "*p = x"
has an iva of p. IVA is used to represent
an arbitrary number of indirections. So, "**q = x" and
"***r = x" are also an indirect variable assignment.
- Pointer variable use (pvu).
- Whenever s includes a pointer
dereference of p or pp,
s has a pointer variable use
of p or pp.
For example, the statement "*p = x",
in addition to having an iva of p,
also has a pvu of p.
- Pointer variable assignment (pva).
- Whenever s includes an assignment
to p itself, or to pp itself
(rather than to the memory location that
the variable refers to), s contains
a pointer variable assigment of p or pp.
For example, the statement "p = &x"
has a pva of p.
- Address use (au).
- Whenever s takes an address of
a variable, s contains an
address use of that variable.
For example, the statement "p = &x",
has an au of x.
For each variable used or defined at a node,
that variable's occurence types are listed,
following the variable, in parentheses.
- Table type 8 - rd (reaching definitions information).
- This table lists the three general columns described above,
with a fourth column
called "reaching definitions" that lists the variable definitions
that reach the node listed in the first column.
The reaching definitions column lists the node at which the
reaching definition was defined,
followed by a colon and the name
of the variable whose definition reaches.
- Table type 9 - src (preprocessed source file with line numbers).
- This table lists the source code from which
graphs are built, along with line numbers.
- Table type 10 - dom ((post)dominator trees and (post)dominator information).
- This table contains the three general columns described above,
and a column labeled "dom info", that lists information
on dominance and postdominance relationships in the control
flow graph for the source file. For each
node in the control flow graph for a source file,
eight types of (post)dominance information are listed:
- DOMINATES:
- A list of the cfg nodes that the node in the left-hand column dominates.
- DOMINATED BY:
- A list of the cfg nodes that the node in the left-hand column is dominated by.
- DTREE PARENT:
- The parent, in the dominator tree for the cfg, of the node in the
left-hand column.
- DTREE CHILDREN:
- A list of the children, in the dominator tree for the cfg, of the
node in the left-hand column.
- P-DOMINATES:
- A list of the cfg nodes that the node in the left-hand column postdominates.
- P-DOMINATED BY:
- A list of the cfg nodes that
the node in the left-hand column is postdominated by.
- PDTREE PARENT:
- The parent, in the postdominator tree for the cfg, of the node in the
left-hand column.
- PDTREE CHILDREN:
- A list of the children, in the postdominator tree for the cfg, of the
node in the left-hand column.
- Table type 11 - cg (call graph).
- This table reports call graph
information for a source file,
in five columns.
A "node id" column assigns a unique integer node identifier
to each node in the call graph (where each node represents a
procedure.) The "procedure" column lists the
name of the procedure associated with that node.
The "defined?" column contains either "YES" or "NO";
the former indicates that the procedure corresponding
to the node is present in the source file, the latter indicates
that it is external to (its source is not given in) the source file.
Note that such nodes are necessarily leaves in the call graph,
because we cannot determine the procedures that
their associated procedure calls.
The "calls procedure(s)" column lists, one per line, the names
of the procedures that are called by the procedure
associated with the node. The "node id" column
gives the associated node number of such called procedures.
When the procedure
contains no calls, this is indicated.
Graph Viewer Output
Aristotle uses the XVCG tool to display graphs.
XVCG is an X11 software package for Visualization of Compiler Graphs.
The Aristotle distribution includes an XVCG binary for Solaris
in the bin subdirectory. Since the bin directory should be in your path
for running Aristotle, XVCG should be able to run without any
further setup.
XVCG is distributed under the terms of the GNU General Public License and
is available by anonymous ftp at ftp.cs.uni-sb.de (134.96.254.254) in the
directory /pub/graphics/vcg. There is a web site in
Germany at the University of Saarland
(
http://www.cs.uni-sb.de:80/RW/users/sander/html/gsvcg1.html).
More information,
including source code and documentation, is available on the website.
Refer to XVCG documentation for usage details.
6. Definitions and Terms Used
This section contains a short list of the following program analysis
terms which are frequently encountered. The topics are:
6.1. Call Graph (CG)
A call graph models the calling relationships between entities in a program and is
useful for flow-insensitive analyses and program understanding. Source files may
contain calls to external routines, including standard library routines that cannot
themselves be analyzed for calling relationships. In such cases, the call graph
includes nodes for the called routines but marks them as unexpanded.