PVaniM-GTW: Visualization Support for Parallel Simulations in Network Computing Environments


Parallel discrete event simulation (PDES) systems are used to simulate large-scale applications such as telecommunication networks, transportation grids, and battlefield scenarios. While a large amount of PDES research has focused on multiprocessors and multicomputers, networks of workstations interconnected through Ethernet or ATM have evolved into a popular and effective platform for PDES. Nonetheless, developing efficient PDES systems in network computing environments comes with its share of difficulties, and these can severely degrade simulator performance. To better understand how these factors degrade performance, and to develop new algorithms that mitigate them, we investigate the use of graphical visualization to provide insight into performance evaluation and simulator execution. We began with a general-purpose network computing visualization system, PVaniM, and used it to investigate the execution of an advanced version of Time Warp, called Georgia Tech Time Warp (GTW), which executes in network computing environments. Because PDES systems such as GTW are essentially middleware that support their own applications, we soon realized that these systems require their own middleware-specific visualization support. To this end, we have extended PVaniM into a new system, called PVaniM-GTW, by adding middleware-specific views. Our experiences with PVaniM-GTW indicate that these enhancements satisfy the needs of PDES middleware better than general-purpose visualization systems do, without requiring the end user to develop application-specific visualizations.

The following describes the middleware-specific views that have been added to PVaniM. A description of PVaniM's default views may be found here.

The Processor Advance Time (PAT) view is located to the right of the Host List view. The processor advance time for a processor is defined as the amount of wall clock time needed to advance the simulation by a single unit of simulation time. When the PAT values among the host processors differ, there is a load imbalance. The BGE algorithm migrates clusters of logical processes (LPs) to the appropriate machines so that the PAT values across all machines are about equal. Consequently, this view gives an immediate indication of how well the BGE algorithm is balancing the load. For processors not in use, the PAT value is zero.
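The PAT definition above can be illustrated with a short Python sketch. It is not taken from PVaniM-GTW or GTW: the function names, the host names, the sample measurements, and the max/min imbalance ratio are all assumptions made for illustration.

```python
def processor_advance_time(wall_clock_seconds, sim_time_advanced):
    """Wall clock seconds needed to advance one unit of simulation time.

    Returns 0.0 when no simulation time was advanced, mirroring the
    convention that processors not in use show a PAT of zero.
    """
    if sim_time_advanced <= 0:
        return 0.0
    return wall_clock_seconds / sim_time_advanced


def pat_imbalance(pat_values):
    """Ratio of the largest to the smallest PAT among active processors.

    A value near 1.0 suggests a balanced load. This ratio is an assumed
    summary metric, not one defined by PVaniM-GTW.
    """
    active = [p for p in pat_values if p > 0]
    if not active:
        return 1.0
    return max(active) / min(active)


# Hypothetical per-host samples: (wall clock seconds, simulation time advanced).
samples = {
    "hostA": (10.0, 500.0),
    "hostB": (10.0, 250.0),
    "hostC": (0.0, 0.0),  # not in use
}
pats = {host: processor_advance_time(w, s) for host, (w, s) in samples.items()}
imbalance = pat_imbalance(pats.values())
print(pats)
print(imbalance)
```

In this made-up sample, hostB needs twice as much wall clock time as hostA per unit of simulation time, which is the kind of difference the PAT view makes visible.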

Positioned to the right of the PAT view, the Clusters / Primary Rollbacks (PRBS) view is a toggled view that displays either the distribution of clusters among the active hosts or the percentage of processed events that are rolled back during the sampling interval because of a late-arriving application message (a.k.a. a straggler message).

The clusters view is used in conjunction with other information to determine whether the BGE algorithm is operating correctly. Primary rollbacks serve as one of the major indicators of GTW performance. Fewer events rolled back due to straggler messages means fewer erroneous event computations, which ultimately yields better simulator performance.

The Secondary Rollbacks view is a toggling view (shared with PVaniM's Load view) that shows the percentage of processed events that are rolled back during the sampling interval due to the processing of an anti-message. This view provides insight into how far an erroneous computation has spread by indicating the fan-out of the LP communication links along which messages are scheduled in the application being simulated.
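Both rollback views report a percentage of the events processed during a sampling interval. A minimal Python sketch of that arithmetic follows; the `IntervalCounts` record, its field names, and the sample numbers are hypothetical and do not reflect GTW's actual instrumentation.

```python
from dataclasses import dataclass


@dataclass
class IntervalCounts:
    """Hypothetical counters collected over one sampling interval."""
    events_processed: int
    primary_rolled_back: int    # rolled back because of a straggler message
    secondary_rolled_back: int  # rolled back because of an anti-message


def rollback_percentages(counts):
    """Return (primary %, secondary %) of events processed in the interval."""
    if counts.events_processed == 0:
        return (0.0, 0.0)
    primary = 100.0 * counts.primary_rolled_back / counts.events_processed
    secondary = 100.0 * counts.secondary_rolled_back / counts.events_processed
    return (primary, secondary)


# Made-up interval: 2000 events processed, 100 primary and 50 secondary rollbacks.
interval = IntervalCounts(events_processed=2000,
                          primary_rolled_back=100,
                          secondary_rolled_back=50)
print(rollback_percentages(interval))  # (5.0, 2.5)
```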

The Aborted Events view is a toggling view (shared with PVaniM's Memory Usage view) that shows the percentage of processed events that are aborted during the sampling interval. The GTW system allocates a fixed number of event buffers during initialization and manages those buffers itself to avoid costly memory allocation system calls at runtime. An event is aborted if the scheduling of a future event fails because all event buffers are currently in use. This approach is used to prevent a processor from becoming overly optimistic. Usually, events are aborted because of a slow GVT calculation process or a general lack of event buffers due to a large set of pending events. Like rollbacks, aborted events have a detrimental effect on system performance and should be avoided whenever possible.
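The buffer behavior described above can be sketched in Python as a pool that is allocated once and never grows, so a scheduling attempt fails when the pool is empty. The `EventBufferPool` class, the `process_event` function, and the pool size of two are assumptions chosen for illustration; GTW's real buffer management is not shown on this page.

```python
class EventBufferPool:
    """Fixed-size pool of event buffers, all allocated up front (hypothetical)."""

    def __init__(self, capacity):
        # Allocate every buffer once so no allocation happens at runtime.
        self.free = [dict() for _ in range(capacity)]

    def acquire(self):
        """Return a free buffer, or None when every buffer is in use."""
        return self.free.pop() if self.free else None

    def release(self, buf):
        """Hand a buffer back to the pool for reuse."""
        buf.clear()
        self.free.append(buf)


def process_event(pool, pending, timestamp, delay):
    """Process an event that schedules one future event.

    Returns False (the event is aborted) when no buffer is available
    for the future event.
    """
    buf = pool.acquire()
    if buf is None:
        return False
    buf["timestamp"] = timestamp + delay
    pending.append(buf)
    return True


pool = EventBufferPool(capacity=2)
pending = []
results = [process_event(pool, pending, t, 1.0) for t in (0.0, 1.0, 2.0)]
aborted_percent = 100.0 * results.count(False) / len(results)
print(results)  # [True, True, False]

# Once a buffer is reclaimed, the aborted event can be retried successfully.
pool.release(pending.pop(0))
retry_ok = process_event(pool, pending, 2.0, 1.0)
print(retry_ok)  # True
```

The third event is aborted because both buffers are held by pending events. The number of events a processor can have outstanding is therefore bounded by the pool size.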

PVaniM-GTW Research Team

Christopher Carothers
Parallel Simulation and Computer Architecture
College of Computing
Georgia Institute of Technology
Brad Topol
Graphics, Visualization & Usability Center
College of Computing
Georgia Institute of Technology
Richard Fujimoto
Parallel Simulation and Computer Architecture
College of Computing
Georgia Institute of Technology
John Stasko
Graphics, Visualization & Usability Center
College of Computing
Georgia Institute of Technology
Vaidy Sunderam
Department of Math and Computer Science
Emory University

