Visualizing Concurrent Programs


Concurrent Programs

Concurrent programs are by nature large and complex and may produce vast quantities of data. Instead of a single thread of execution, concurrent programs have multiple threads of execution which communicate with each other, compete for shared resources, periodically synchronize, and may be dynamically created and destroyed. Execution may be non-deterministic, so a failing run may not be reproducible, which complicates debugging.

Collecting data from concurrent programs raises further obstacles: producing a snapshot of program state is more complex than in a sequential program; memory may be distributed, and messages in transit may be difficult to access; system clocks may not be synchronized across the multiple processors involved in the computation, and may drift at different rates.

Graphical visualization can be a powerful tool for understanding and explaining complex tasks such as parallel computation. Appropriate displays of concurrent programs can help the viewer develop intuition about performance and correctness problems that stem from unanticipated interactions between processes. In the following sections, we discuss the tasks involved in creating a visualization of a parallel program: data collection, analysis, and display.

Data Collection

Instrumentation is one of the main techniques used to collect data, which is then analyzed and visualized. Any form of instrumentation will perturb the system to some degree. The extent of perturbation varies according to the volume of information collected, the monitoring method, the system architecture and system software, the level at which the instrumentation occurs, and the programming paradigm. Further, the programmer's tolerance for perturbation of the system will vary with the task being performed. Programmers performing debugging tasks will tolerate greater performance losses, as their main interest is correctness rather than performance.

Instrumentation can be done at various levels: hardware, operating system, run-time environment, application program, etc. Each level yields a different kind of information: at the hardware level, one can obtain process CPU times, program counter samples, cache misses, etc.; at the operating system level, one can obtain information such as messages sent and received, process creation, scheduling, etc.; run-time environment instrumentation can provide information such as the state of various run-time queues, the acquisition and release of locks, procedure calls and returns, etc.; application-level instrumentation can provide information about abstract, high-level, and user-defined events.
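As an illustration of the application level, the following is a minimal sketch of recording user-defined events from several threads. The names `EventLog`, `record`, `worker`, and `run_demo` are hypothetical and do not come from any of the systems discussed here; the event names and the tuple layout are likewise assumptions.

```python
import threading
import time


class EventLog:
    """Thread-safe, in-memory log of user-defined events (hypothetical helper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._events = []

    def record(self, name, **details):
        # Each entry holds a local timestamp, the recording thread's name,
        # the user-defined event name, and any extra details.
        entry = (time.monotonic(), threading.current_thread().name, name, details)
        with self._lock:
            self._events.append(entry)

    def snapshot(self):
        # Return a copy so callers cannot mutate the log.
        with self._lock:
            return list(self._events)


def worker(log, items):
    # The application decides which events are worth recording.
    log.record("worker_start", count=len(items))
    total = sum(items)
    log.record("worker_done", total=total)


def run_demo():
    log = EventLog()
    threads = [
        threading.Thread(target=worker, args=(log, [i, i + 1]), name=f"worker-{i}")
        for i in range(3)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log.snapshot()
```

Note that the lock inside `record` is itself a source of perturbation: threads that would otherwise run independently now serialize briefly on the log.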

Data collection for concurrent systems is more complicated than for sequential systems. Concurrent programs tend to be long running and produce large amounts of data. It is hard to determine globally consistent states as the data is distributed across separate memories.

On-line versus post-mortem visualization: On-line visualization can provide an up-to-the-moment view of the computation's progress and can reduce the overhead of storage. However, the visualization cannot be too detailed. Post-mortem visualization provides the opportunity for a more detailed display that can proceed at a user-specified pace, and generally will perturb the program to a lesser extent. Filtering and reduction of data might have to be used to reduce the storage overhead.
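One simple form of filtering and reduction can be sketched as follows: discard event kinds that are not of interest, and replace the remaining events with per-interval counts. This is only an illustration; the function name `reduce_trace`, the `(time, kind)` event layout, and the fixed bucket width are assumptions.

```python
from collections import Counter


def reduce_trace(events, keep_kinds, bucket):
    """Filter events by kind, then count them per time bucket.

    events:     iterable of (time, kind) pairs, with integer times
    keep_kinds: set of event kinds to keep (filtering)
    bucket:     width of each time interval (reduction)
    """
    counts = Counter()
    for t, kind in events:
        if kind in keep_kinds:
            counts[(t // bucket, kind)] += 1
    return dict(counts)


trace = [(0, "send"), (3, "recv"), (4, "lock"), (12, "send"), (15, "send")]
summary = reduce_trace(trace, keep_kinds={"send", "recv"}, bucket=10)
# summary == {(0, "send"): 1, (0, "recv"): 1, (1, "send"): 2}
```

The reduced form stores one counter per interval and kind rather than one record per event, at the cost of losing the exact order of events within an interval.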

Collecting data from concurrent programs produces multiple streams of data, and these streams must be appropriately ordered and merged for analysis and visualization. Lamport defined a consistent ordering of events in a distributed system in terms of the happened-before relation.
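The merging step can be sketched as below, under the assumption that each process has already produced its own stream sorted by a timestamp that is comparable across processes (for example, a logical timestamp). The `(timestamp, process_id, description)` layout and the name `merge_streams` are hypothetical; ties are broken by process id so that the result is deterministic.

```python
import heapq


def merge_streams(streams):
    """Merge per-process event streams into one ordered stream.

    Each stream must already be sorted by timestamp. Events are
    (timestamp, process_id, description) tuples.
    """
    return list(heapq.merge(*streams, key=lambda ev: (ev[0], ev[1])))


p0 = [(1, 0, "send m1"), (4, 0, "recv m2")]
p1 = [(2, 1, "recv m1"), (3, 1, "send m2")]
merged = merge_streams([p0, p1])
# Order of descriptions: send m1, recv m1, send m2, recv m2
```

If the timestamps come from unsynchronized physical clocks, this merge can place a receive before its matching send, which is why a consistent ordering is needed before merging.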

Analysis

The analysis phase involves processing the collected data, which may be as simple as format conversion or the calculation of statistics, before passing it on to the display phase of the visualization system. In a post-mortem visualization system, the analysis and display may be simplified by preprocessing the trace file. The amount of analysis that may be performed in on-line visualization systems is limited by time constraints.

IPS-2 is an example of a system which creates intermediate data structures, such as procedure call, synchronization, and other flow graphs, as part of the analysis process. Pablo is an example of a system where the analysis phase is user-directed: it provides a set of performance data transformation modules that the user can manipulate and interconnect graphically. EBBA is a high-level debugging tool that allows the user to specify models of program behavior consisting of abstract and primitive events. It uses clustering and filtering techniques to examine the stream of primitive events from the program, and applies a pattern-matching algorithm to construct user-specified abstract events. TraceViewer is a tool for detecting non-determinism.
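The general idea of building abstract events from a stream of primitive events can be illustrated with a toy sketch. This is not EBBA's algorithm or interface; the function `recognize_transfers`, the `(process, kind, msg_id)` event layout, and the "transfer" abstract event are all assumptions made for the example.

```python
def recognize_transfers(primitive_events):
    """Build abstract 'transfer' events from primitive send/recv events.

    primitive_events: iterable of (process, kind, msg_id) tuples in trace order.
    A 'transfer' is recognized when a 'recv' follows a 'send' with the same
    message id. All other primitive events are filtered out.
    """
    pending = {}   # msg_id -> sending process, for sends not yet received
    abstract = []
    for process, kind, msg_id in primitive_events:
        if kind == "send":
            pending[msg_id] = process
        elif kind == "recv" and msg_id in pending:
            sender = pending.pop(msg_id)
            abstract.append(("transfer", sender, process, msg_id))
    return abstract


primitive = [
    ("P0", "send", "m1"),
    ("P1", "compute", None),
    ("P1", "recv", "m1"),
    ("P1", "send", "m2"),
    ("P0", "recv", "m2"),
    ("P0", "recv", "m3"),   # no matching send in the trace, so it is ignored
]
transfers = recognize_transfers(primitive)
# transfers == [("transfer", "P0", "P1", "m1"), ("transfer", "P1", "P0", "m2")]
```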

One form of analysis peculiar to parallel systems is the determination of the order of events. At the simplest level, this might involve assignment of timestamps to events. Lamport's logical time ordering is a convenient way to obtain a consistent causal ordering.
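Lamport's logical clock rules are short enough to sketch directly: each process increments a local counter on every event, stamps outgoing messages with the counter, and on receipt advances its counter past the stamp it received. The class and method names below are hypothetical; only the rules themselves come from Lamport's scheme.

```python
class LamportClock:
    """Logical clock for one process (illustrative sketch)."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Rule 1: increment the counter on every local event.
        self.time += 1
        return self.time

    def send(self):
        # Sending is an event; the returned value is attached to the message.
        return self.tick()

    def receive(self, msg_time):
        # Rule 2: on receipt, move past both the local counter and the stamp.
        self.time = max(self.time, msg_time) + 1
        return self.time


p, q = LamportClock(), LamportClock()
a = p.tick()        # local event on p      -> 1
m = p.send()        # p sends a message     -> 2
b = q.tick()        # local event on q      -> 1
c = q.receive(m)    # q receives p's message -> max(1, 2) + 1 = 3
```

The receive event always gets a larger timestamp than the matching send, so ordering events by timestamp never contradicts happened-before. The converse does not hold: `a` and `b` both have timestamp 1 but are concurrent.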

Display

The most critical issue in the visualization of a concurrent program is the ability to convey knowledge and intuition regarding the behavior of the program. Effective displays must be capable of representing the interactions between processes with respect to synchronization and communication and their relationships; they must be capable of concurrently displaying events that were concurrent in the program. Scalability is one of the hard problems for displays: the format, size, meaning, and effectiveness of a display should be independent of the number of processing elements involved.

The Gthreads package provides an animated program call graph view, which is dynamically constructed as threads are forked, functions are called, and the point of execution of a thread moves from function to function. Conch provides a message passing view, where processes are arranged around the outside of a ring, and messages are represented as colored disks that move into the center of the ring when sent and out to the receiving process upon receipt. The AIMS system presents a network topology view of the system under study. Many debuggers show a running display of communications over time in their time-process diagrams. A number of performance evaluation systems provide displays of statistical information.
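To make the time-process diagram concrete, here is a minimal text-only sketch: one row per process, one column per time step, with a symbol wherever an event occurs. It is not modeled on any particular debugger; the function name, the `(time, process, symbol)` event layout, and the symbols `S` (send) and `R` (receive) are assumptions.

```python
def time_process_diagram(events, processes, width):
    """Render events as rows of text, one row per process.

    events:    iterable of (time, process, symbol) with 0 <= time < width
    processes: process names, in display order
    width:     number of time steps to draw
    """
    rows = {p: ["-"] * width for p in processes}
    for t, p, symbol in events:
        rows[p][t] = symbol
    return "\n".join(f"{p} " + "".join(rows[p]) for p in processes)


comm = [(0, "P0", "S"), (2, "P1", "R"), (3, "P1", "S"), (5, "P0", "R")]
diagram = time_process_diagram(comm, ["P0", "P1"], width=6)
print(diagram)
# P0 S----R
# P1 --RS--
```

The sketch also shows the scalability problem in miniature: the diagram grows by one row per process and one column per time step, so it stops being readable long before the process count becomes large.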

The examples above illustrate generic displays rather than application-specific displays. Application-specific displays require a little more effort on the part of the developer. Voyeur, PARADISE, BALSA, Tango, Pavane, and POLKA are some examples.


Ethendranath N B
Last modified: Tue Mar 10 17:02:09 EST 1998