Dowser Structure


Introduction

This note briefly describes the software structure of the Dowser, with an emphasis on component portability.

Background

The Dowser is an attempt to integrate domain analysis into reverse engineering. The Dowser is designed to serve as a gathering point for, and interface to, various tools that perform both reverse engineering and domain analysis.

Principles

The main principle guiding the design and implementation of the Dowser is the set-of-tools approach, as advocated by, for example, David Notkin (Sullivan and Notkin, Reconciling Environment Integration and Component Independence, in the proceedings of the Fourth ACM SIGSOFT Symposium on Software Development Environments, pages 22-33, 1990), John Ousterhout (Tcl and the Tk Toolkit, Addison-Wesley, 1994), and the Open Systems Development Group (the ToolTalk infrastructure).

The set-of-tools approach uses two strategies. The first is to design many small tools, each of which does one thing well. This strategy has been used with success by the Unix operating system (Pike and Kernighan, Program Design in the Unix Environment, Bell System Technical Journal, vol. 63, no. 8, pages 1595-1605, October 1984). The second strategy is to design a framework, or glue, that lets individual tools work together while maintaining their independence. Examples of such a framework include the send primitive in Tk, which lets independently executing tools send commands to one another, and the ToolTalk inter-tool coordination infrastructure, which acts as a message exchange and notification service for independently executing tools.

Architecture

The Dowser's current organization is shown in the following figure.

[Figure: Dowser architecture]

The tools making up the Dowser tool set fall into three classes. The Dowser tools are the tools with which the analyst interacts directly. The analysis tools are called on by the Dowser tools to perform various types of code and domain analyses as directed by the analyst. The support tools are used by the Dowser tools to provide long-term (that is, inter-session) coordination and storage. In addition to these three classes, the Coordinator is the framework for coordinating the Dowser tools.

The analysis tools are used predominantly for domain and code analysis. The analysis tools are generic with respect to their input, in that they can be used on any type of source in cases where it makes sense to do so. Example domain-analysis tools include concordance generators and programs for doing n-gram analysis. Example code-analysis tools include tools from the SunPro Software System and various Icon, GNU awk, and k-shell scripts. The SunPro tools we use are the C and C++ compiler CC, version SC3.0.1 02 Mar 1995; the Fortran 77 compiler f77, version SC3.0.1 13 Jul 1994; and the sbdump program, version 3.0.1 94/07/14. The compilers perform code analysis, generating binary analysis files; the sbdump program translates the binary analysis files into ASCII analysis files; and the remaining tools (the Icon, GNU awk, and k-shell scripts) install the code-analysis data in the relational database.
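As an illustration of the kind of domain analysis mentioned above, the following is a minimal sketch of word n-gram counting. It is not one of the Dowser's actual tools (those are Icon, awk, and k-shell scripts); the function names, the tokenizing rule, and the sample text are assumptions made for this example.

```python
import re
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(text, n=2):
    """Count word n-grams in text, ignoring case and punctuation.

    The tokenizing rule (runs of letters and underscores) is an
    assumption chosen so that comments and identifiers both work.
    """
    tokens = re.findall(r"[A-Za-z_]+", text.lower())
    return Counter(ngrams(tokens, n))


# Hypothetical sample input, for example text taken from source comments.
sample = "read the sensor value then scale the sensor value"
counts = ngram_counts(sample, 2)

# List the bigrams that occur more than once.
for gram, count in counts.most_common():
    if count > 1:
        print(" ".join(gram), count)
```

Because the function takes plain text, the same sketch applies to comments, documentation, or identifiers, which is the sense in which an analysis tool can be generic with respect to its input.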

The support tools are used to provide inter-session coordination and storage for Dowser analyses. The main support tool at the moment is the MySQL relational database, version 3.21.30. The database is used to store program-analysis information and a small amount of domain information. For each program, the database tables include, for example, a symbol table giving the type, program location, and use of all symbols in the program, and a call table indicating which procedures are called in the program and where.
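To make the two example tables concrete, here is a minimal sketch of what a symbol table and a call table might look like. The Dowser's actual schema is not given in this note, so every table name, column name, and row below is an assumption; the sketch also uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a database server.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL server.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: the real Dowser tables may differ.
conn.executescript("""
CREATE TABLE symbols (
    name     TEXT,     -- symbol name
    sym_type TEXT,     -- declared type
    file     TEXT,     -- program location: source file
    line     INTEGER,  -- program location: line number
    sym_use  TEXT      -- how the symbol is used (definition, reference, ...)
);
CREATE TABLE calls (
    caller TEXT,       -- procedure containing the call
    callee TEXT,       -- procedure being called
    file   TEXT,       -- where the call occurs: source file
    line   INTEGER     -- where the call occurs: line number
);
""")

# Made-up rows for illustration only.
conn.executemany(
    "INSERT INTO symbols VALUES (?, ?, ?, ?, ?)",
    [
        ("read_sensor", "int ()", "sensor.c", 5, "definition"),
        ("gain", "double", "main.c", 3, "definition"),
    ],
)
conn.executemany(
    "INSERT INTO calls VALUES (?, ?, ?, ?)",
    [
        ("main", "read_sensor", "main.c", 12),
        ("poll", "read_sensor", "poll.c", 40),
        ("main", "scale", "main.c", 13),
    ],
)


def callers_of(conn, callee):
    """Return (caller, file, line) for every call to the given procedure."""
    rows = conn.execute(
        "SELECT caller, file, line FROM calls "
        "WHERE callee = ? ORDER BY file, line",
        (callee,),
    )
    return rows.fetchall()


for caller, file, line in callers_of(conn, "read_sensor"):
    print(caller, file, line)
```

Keeping this information in a relational database means a question such as "where is this procedure called?" can be answered in a later session without rerunning the compilers.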

The Dowser tools themselves have the generic architecture illustrated by the following figure.

[Figure: Dowser tools architecture]

The Coordinator is the central repository for information about the Dowser tools. Each Dowser tool that runs registers with the Coordinator, which involves transferring information about the tool to the Coordinator. Dowser tools may query the Coordinator to get information about other tools in operation, and then may connect directly to these tools to send them commands. A Dowser tool may also start up another Dowser tool if a tool of the proper type isn't already running.
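The register, query, and start-if-missing behavior described above can be sketched as follows. This is only an in-process illustration under stated assumptions: the real Dowser tools are separate programs that communicate with one another, and the class, method names, tool names, and address strings here are hypothetical.

```python
class Coordinator:
    """Central repository of information about running tools (sketch)."""

    def __init__(self):
        self._tools = {}  # tool name -> {"type": ..., "address": ...}

    def register(self, name, tool_type, address):
        """Record a running tool's type and how to reach it."""
        self._tools[name] = {"type": tool_type, "address": address}

    def unregister(self, name):
        """Forget a tool that has exited."""
        self._tools.pop(name, None)

    def find(self, tool_type):
        """Return (name, address) pairs for running tools of a given type."""
        return [
            (name, info["address"])
            for name, info in sorted(self._tools.items())
            if info["type"] == tool_type
        ]


def connect_or_start(coordinator, tool_type, start_tool):
    """Return the address of a running tool of tool_type, starting one if needed.

    start_tool is a hypothetical callback that launches a tool and returns
    (name, address). In this sketch the caller registers the new tool; in
    the design described above, each tool registers itself when it runs.
    """
    running = coordinator.find(tool_type)
    if running:
        return running[0][1]
    name, address = start_tool(tool_type)
    coordinator.register(name, tool_type, address)
    return address


coord = Coordinator()
coord.register("browser-1", "browser", "localhost:9001")

# A browser is already running, so its address is returned directly.
browser_addr = connect_or_start(coord, "browser", lambda t: ("unused", "unused"))

# No editor is running, so one is started and then registered.
editor_addr = connect_or_start(
    coord, "editor", lambda t: ("editor-1", "localhost:9002")
)
```

Note that the Coordinator only answers queries about which tools exist; once a tool has an address, it sends commands to the other tool directly, which keeps the tools independent of one another.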

Portability

The portability issues are:

Tcl and Tk
The Tcl scripting language and its Tk toolkit are available on a wide variety of machines. The Dowser uses version 8.0.5 of Tcl and Tk. The Dowser cannot use earlier versions, and we are not anticipating an upgrade to version 8.1 anytime soon.

K-shell
K-shell is proprietary to AT&T. There are public-domain implementations of k-shell, but we are unfamiliar with them. The GNU bash shell is available on a wide variety of machines and claims to maintain compatibility with k-shell.

C and C++
The GNU C and C++ compiler gcc is available on a wide variety of machines. In addition, the C and C++ code is straightforward and compiles to ANSI standards, so it should compile with other C and C++ compilers without problems.

MySQL
MySQL is available on a number of machines, including Unix variants, but the versions for Microsoft systems are for paying customers only.

SunPro Compilers
The SunPro Software System is proprietary to Sun Microsystems and is, as far as we know, not replicated in readily available software.


This page last modified on 17 June 1999.