Visualization for Parallel Performance Evaluation and Optimization
Parallel systems promise significant performance benefits over
uniprocessor or non-distributed systems; however, it is hard to
realise their full potential in practice, and their performance is
very hard to analyze and tune. The main goal of visualization is to
help users understand and reason about the performance of parallel
and concurrent programs. Continuous evolution in the design and use
of parallel systems has been a stumbling block in the development of
visualization techniques for parallel performance evaluation.
The main challenge lies in visually representing the mental models
users apply in attempting to understand performance information. It is
challenging to relate performance information back to the abstractions
that the user understands, particularly when a performance view that
appeals to one person's mental model may have little in common with
models held by others. The solution is to integrate the visualization
techniques and methods used to construct graphic displays of the data
closely with the models of parallel computation the data represent.
In the following section, we discuss a parallel performance visualization
model and an underlying theory for applying visualization principles. We
then discuss visualization principles and scenarios.
A Parallel Performance Visualization Model
A proposed high-level model of parallel performance visualization
highlights the architectural relationship between performance data
analysis and performance display, emphasizing the different aspects of
visualization development. The model is based on the following notions:
- Performance analysis abstraction
- Performance view
- Performance visual abstraction
- Performance display
- Performance visualization abstraction
The key point in the model is that the performance visual design can
and should incorporate knowledge of the performance analysis
abstraction very early on, providing the basis for performance
interpretation in the final visualization. A performance visualization
abstraction must be instantiated in a performance visualizer tool that
implements performance views, displays, and their mappings using
environment-specific graphics technology, based on underlying graphics
libraries, toolkits, and other resources.
Performance Visualization Evolution
Much of the early work in this area focused on development of specific
performance displays derived mostly from statistical graphics and
adapted to represent performance data; Kiviat diagrams and Gantt
charts are good examples. The work on multiple views and graphical
performance representations began to define a theory of performance
visualization. Integrating and supporting multiple views and displays
posed a software engineering challenge. Environments began to support
more user interaction in selecting view-display combinations and in
specifying view-display attributes.
Recently, performance visualization research has shifted its emphasis
to the technology required to generate application-specific
performance visualizations. The three primary objectives of this recent
work are:
- to exploit the perceptual capabilities of sophisticated graphics
in the design of performance displays
- to provide support for high-level performance visual abstractions
and their instantiation through visualization languages, graphics libraries,
and data visualization environments
- to involve the user directly in prototyping and customizing performance
displays so that meaningful performance visualizations can be readily
constructed and evaluated.
Visualization Concepts and Principles
In this section, we present a classification of the concepts and
principles that are found across the range of performance tools cited
in the literature.
- Context
  - Perspective
  - Semantic context
  - Subview mapping
- Scaling
  - Multidimensional and multivariate representation
  - Macroscopic and microscopic views
  - Macro/micro composition and reading
  - Adaptive graphical display
  - Display manipulation
  - Composite view
- User Perception and Interaction
  - Perception and cognition
  - Observing patterns
  - User interaction
- Comparison
  - Multiple views
  - Small multiples
  - Cross-execution views
- Extraction of Information
  - Reduction and filtering
  - Clustering
  - Encoding and abstracting
  - Separating information
Visualization Scenarios
In this section, we discuss some of the performance evaluation problems
encountered and the range of possible visualization applications.
- Processor Utilization, Concurrency, and Load Balance
Kiviat diagrams, for example, depict relationships among multivariate
performance data, such as the utilization of various resources in
computer systems. To depict processor utilization, each processor is
represented in the diagram by a separate radial axis, and its
percentage utilization determines a point along this axis, from 0% at
the origin to 100% at the perimeter. Connecting these points by
straight lines forms a polygon whose shape and size give a quick
visual indication of overall efficiency, individual processor
utilization, and the load balance across processors.
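The construction described above can be sketched in a few lines. This
is only an illustrative sketch, not code from any particular tool: the
function names kiviat_vertices and polygon_area are hypothetical, and
using the polygon's area as a rough size indicator is an assumption
made here for illustration.

```python
import math


def kiviat_vertices(utilizations):
    """Map per-processor utilization percentages (0-100) to polygon vertices.

    Processor i gets its own radial axis at angle 2*pi*i/n; its
    utilization sets the distance from the origin (0% at the origin,
    100% at radius 1).
    """
    n = len(utilizations)
    vertices = []
    for i, percent in enumerate(utilizations):
        angle = 2 * math.pi * i / n
        radius = percent / 100.0
        vertices.append((radius * math.cos(angle), radius * math.sin(angle)))
    return vertices


def polygon_area(vertices):
    """Area of the polygon through the vertices, via the shoelace formula."""
    n = len(vertices)
    total = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Four processors: a balanced run and an unbalanced one (made-up values).
balanced = kiviat_vertices([90, 90, 90, 90])
unbalanced = kiviat_vertices([95, 30, 90, 20])
print(polygon_area(balanced), polygon_area(unbalanced))
```

A regular, large polygon corresponds to uniformly high utilization; a
small or lopsided one points to idle or unevenly loaded processors.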
Utilization data can be integrated over the lifetime of the
computation and broken down into busy, overhead, and idle values,
giving a quick and effective visual impression of load balance and
overhead, but no insight into concurrency.
This deficiency is remedied by the timeline display, where the
same information is now given as a function of time, so that specific
periods of busy and idle activity on specific processors can be
identified and correlated across processors.
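The difference between the two kinds of data can be illustrated with a
small sketch. The trace format (a list of (start, end, state)
intervals per processor), the sample values, and the function names
summarize and busy_at are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical trace: processor id -> list of (start, end, state) intervals.
timeline = {
    0: [(0, 4, "busy"), (4, 5, "overhead"), (5, 10, "idle")],
    1: [(0, 5, "idle"), (5, 6, "overhead"), (6, 10, "busy")],
}


def summarize(timeline):
    """Integrate each state over the whole run (the time-independent summary)."""
    totals = {}
    for proc, intervals in timeline.items():
        per_state = defaultdict(float)
        for start, end, state in intervals:
            per_state[state] += end - start
        totals[proc] = dict(per_state)
    return totals


def busy_at(timeline, t):
    """Return the processors that are busy at time t (a timeline query)."""
    return [
        proc
        for proc, intervals in timeline.items()
        if any(start <= t < end and state == "busy"
               for start, end, state in intervals)
    ]


print(summarize(timeline))
print(busy_at(timeline, 2), busy_at(timeline, 7))
```

In this made-up trace both processors have identical busy, overhead,
and idle totals, so the summary suggests perfect load balance, yet the
timeline queries show that the two processors are never busy at the
same time.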
The Kiviat diagram and timeline display above demonstrate a tension
that sometimes exists in performance visualization between views that
try to convey impressions of overall performance behavior and those
that support the presentation of detailed information. The visual
abstraction of the Kiviat diagram can be extended to show a history of
processor utilization values in the form of a "Kiviat tube", formed
from two-dimensional Kiviat "slices" laid out in time along the third
dimension.
- Critical Paths in Parallel Computation
The large amounts of data in parallel programs make it extremely
difficult to identify the source of performance problems. The critical
path is the longest serial thread, or chain of dependences, running
through the execution of a parallel program. The execution time of the
program cannot be reduced without shortening the critical path, and
hence it is a potential place to look for bottlenecks. For parallel
programs based on message passing, an appropriate visual abstraction
for depicting the critical path is a minor modification of a spacetime
diagram, since data dependencies are satisfied by interprocessor
communications.
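As a minimal sketch of the underlying computation, the critical path
can be found as the longest weighted path through a dependence graph
whose edges are program order within a process plus send-to-receive
message edges. The event names, durations, and the function
critical_path below are hypothetical; the sketch assumes an acyclic
graph in which every predecessor also appears in the durations table.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical events with their durations.
durations = {
    "p0_compute": 3, "p0_send": 1,
    "p1_compute": 2, "p1_recv": 1, "p1_finish": 4,
}
# Event -> set of events it depends on. The edge p0_send -> p1_recv
# models a message between processors.
deps = {
    "p0_send": {"p0_compute"},
    "p1_recv": {"p1_compute", "p0_send"},
    "p1_finish": {"p1_recv"},
}


def critical_path(durations, deps):
    """Return (length, path) of the longest dependence chain."""
    graph = {event: deps.get(event, ()) for event in durations}
    finish = {}  # earliest finish time of each event
    prev = {}    # predecessor that determines each event's start time
    for event in TopologicalSorter(graph).static_order():
        latest = max(deps.get(event, ()), key=lambda p: finish[p], default=None)
        start = finish[latest] if latest is not None else 0
        finish[event] = start + durations[event]
        prev[event] = latest
    # Walk back from the event that finishes last.
    node = max(finish, key=finish.get)
    length = finish[node]
    path = []
    while node is not None:
        path.append(node)
        node = prev[node]
    return length, path[::-1]


length, path = critical_path(durations, deps)
print(length, path)
```

In a spacetime diagram, the events on the returned path are the ones a
tool would highlight, including the message edge that carries the path
from one processor to the other.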
- Access Patterns for Data Distributions
Parallel programming languages, such as High Performance Fortran or
parallel C++ (pC++), incorporate data distribution semantics.
Interprocess communication in such languages is implicitly determined
by the data distribution, and hence the selection of the distribution
is the programmer's main control over parallel efficiency. Thus, an
appropriate analysis abstraction is the proportion of local versus
remote data accesses required to support a given choice of data
distribution.
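The local-versus-remote proportion can be estimated with a small
sketch. It assumes a one-dimensional array, a nearest-neighbour access
pattern, and the owner-computes rule (the processor that owns element
i performs the reads needed to update it); the function names are
hypothetical, and the block and cyclic mappings only mirror the
general idea of such distributions rather than any compiler's exact
layout.

```python
def block_owner(i, n, p):
    """Owner of element i when n elements are split into p contiguous blocks."""
    block_size = -(-n // p)  # ceil(n / p)
    return i // block_size


def cyclic_owner(i, n, p):
    """Owner of element i when elements are dealt out round-robin."""
    return i % p


def local_fraction(owner, n, p, offsets=(-1, 1)):
    """Fraction of neighbour reads that stay on the owning processor.

    For each element i, its owner reads a[i + off] for every offset;
    a read is local if the same processor also owns a[i + off].
    """
    local = total = 0
    for i in range(n):
        for off in offsets:
            j = i + off
            if 0 <= j < n:
                total += 1
                if owner(i, n, p) == owner(j, n, p):
                    local += 1
    return local / total


print(local_fraction(block_owner, 16, 4))
print(local_fraction(cyclic_owner, 16, 4))
```

For this access pattern the block distribution keeps most reads local,
while the cyclic distribution makes every neighbour read remote; a
different access pattern could reverse the ranking, which is why the
proportion is measured per distribution choice.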
Ethendranath N B
Last modified: Tue Mar 10 17:02:17 EST 1998