StreamGen: A Workload Generation Tool for Distributed Information Flow Applications
Overview
StreamGen is a workload generator
targeted at distributed information flow
applications. These include the event streaming services
used in wide-area publish/subscribe systems or in
operational information systems, the data streaming
services used in remote visualization or collaboration,
and the continuous data streams occurring in download
services. Running across heterogeneous distributed
platforms, these services are implemented by
computational components that capture, manipulate, and
produce information streams and are linked via overlay
topologies. StreamGen can be used to produce the
distributed computational and communication loads
imposed by these applications. Dynamic application
behaviors can be created with mathematical
specifications or with behavior traces collected at the
application level. An interesting set of traces
presented in this paper is derived from long-term
observations of the FTP download patterns at
the Linux mirror site run by the CERCS research
center at the Georgia Institute of Technology.
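
As a rough illustration of the two ways of specifying dynamic behavior
mentioned above, the following Python sketch derives a per-tick event count
from either a mathematical rate function or a replayed trace. It is not
StreamGen's actual interface: the names sinusoidal_rate, trace_rate, and
events_per_tick, the sinusoidal shape, and the sample values are all
assumptions made for illustration.

```python
import math
from typing import Callable, List

RateFn = Callable[[float], float]  # maps time in seconds to events per second


def sinusoidal_rate(base: float, amplitude: float, period_s: float) -> RateFn:
    """Hypothetical mathematical specification: a rate that oscillates over time."""
    def rate(t: float) -> float:
        # Clamp at zero so the rate never goes negative.
        return max(0.0, base + amplitude * math.sin(2.0 * math.pi * t / period_s))
    return rate


def trace_rate(samples: List[float], interval_s: float) -> RateFn:
    """Hypothetical trace-driven specification: replay recorded rates, looping at the end."""
    def rate(t: float) -> float:
        index = int(t // interval_s) % len(samples)
        return samples[index]
    return rate


def events_per_tick(rate: RateFn, duration_s: float, tick_s: float) -> List[int]:
    """Convert a rate function into the number of events to emit in each tick."""
    counts: List[int] = []
    carry = 0.0  # fractional events carried over to the next tick
    steps = int(round(duration_s / tick_s))
    for step in range(steps):
        carry += rate(step * tick_s) * tick_s
        whole = int(carry)
        counts.append(whole)
        carry -= whole
    return counts


if __name__ == "__main__":
    # Made-up sample values standing in for a recorded download trace.
    replayed = events_per_tick(trace_rate([10.0, 20.0], 1.0), duration_s=4.0, tick_s=1.0)
    synthetic = events_per_tick(sinusoidal_rate(5.0, 3.0, 60.0), duration_s=10.0, tick_s=1.0)
    print(replayed)
    print(synthetic)
```

Both specifications share the same rate-function shape in this sketch, so
the component that emits events does not need to know whether its load
comes from a formula or from a recorded trace.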
Two different flow-based applications are created and
evaluated with StreamGen. The first emulates the data
streaming behavior in a distributed scientific
collaboration, where a scientific simulation (i.e., a
molecular dynamics code) produces simulation data sent
to and displayed for multiple, interactive remote users.
The second emulates portions of the event-streaming
behavior of an operational information system used by a
large U.S. corporation. Parametric studies with
StreamGen's FTP traces applied to these applications are
used to evaluate different load balancing strategies for
the cluster machines manipulating these applications'
data streams.
The traces are too large to fit within my available web quota. Please send me
an email if you need them.
Publications
Mohamed Mansour, Matthew Wolf, and Karsten Schwan,
"A Workload Generation Tool for Distributed Information Flow Applications,"
International Conference on Parallel Processing (ICPP-04), 2004