CS
4230/6236: Parallel and Distributed Simulation Systems
Fall
Semester, 2005
Projects
Preliminaries
Below are descriptions of some suitable course projects. You may complete one of these projects,
or propose your own. Each of the projects defined below, except the last one,
is intended to be completed by a single individual.
There are three deliverables for each project with due dates as indicated
below:
·
Project Proposal: Friday, October 28
·
Checkpoint Report: Friday, November 18
·
Final Report: Friday, December 9
Project proposals should be emailed both to Professor Fujimoto and the TA. In some cases, the project proposal
need only state which project you plan to work on. In other cases, some additional information is needed, as
indicated below.
The project proposal should be a description of the project you plan to
complete. This will serve as a
problem statement that will eventually become part of the checkpoint and final
reports due later in the semester.
You are free to propose your own project. If you plan to define your own
project you should see or send email to me first to discuss what you want to
do. We are relatively flexible
with respect to the project, so long as it involves simulation in some
interesting way. In this case, the
project selection write up should describe your project at a level of detail
comparable to the project descriptions that follow. Be sure to indicate what
milestones you will reach by the checkpoint, as well as the final report.
More than one person may select the same project. You may (and in fact are
encouraged) to discuss how you will approach the project with other persons
working on the same project, but all code and reports that you turn in must be
entirely your own work.
Some of the projects that follow will require making some modifications to
the FDK software. We will provide help and suggestions on how to modify this
software to complete the project, but keep in mind that complete solutions have
not been worked out.
Computer accounts have been set up for you on the systems lab machines. Please be mindful there are other
students using these machines for their research, and we are guests on these
machines. In particular be careful
not to allow “runaway processes” executing. The systems and general instructional clusters are the
preferred platforms for project work, however you are free to use whatever
computing facilities you have at your disposal.
Project 1: MPI-Based
Distributed Simulation
The goal of this project is to realize a variation on FDK's time management
software (TM-Kit) that runs over the MPI message-passing library, and maximize
its performance. Specifically, the current FDK software computes LBTS values (a
lower bound on the time stamp of future messages a processor can receive) using
reduction computations. If a
reduction is completed and there are no transient messages discovered, the LBTS
computation is complete. If there
are transient messages, the reduction computation is repeated until there are
none. The current software runs
over a message-passing library called FM-Lib that is included with the FDK
software.
You should (1) implement a version of TM-Kit that runs over MPI (rather than
FM-Lib), (2) measure its performance, and (3) develop an optimized version
intended to improve performance in some way, and (4) measure the performance of
the improved version relative to the original. Your objective is to build a very fast implementation. For this purpose, the original version
might be simply a port of the original TM-Kit implementation from FM-Lib to
MPI, or a new version using the same API (application program interface) and
you might consider using MPI’s reduction computations or building another time
management algorithm over MPI for the optimized version. You need not reuse the existing TM-Kit
software at all, but you are free to do so if you wish.
To simplify the implementation of this mechanism, you should preserve the
existing interface to the time management software where appropriate, but you
may modify it if that would be useful.
The preferred version of MPI is
accessed through /net/hj1/ihpcl/bin.
The mpicc/mpif77/mpirun commands are scripts which automatically detect
the appropriate architecture and OS; the “raw” installation is in /net/hj1/ihpcl/`getarch`/mpich-1.2.5.2. It was configured with the
shared/distributed module turned on.
·
Checkpoint: Aim for having the current version of
TM-Kit (or a new version you develop) running over MPI. You should write a report describing
both the current algorithm that is used (in greater detail than the description
above!) the test set up for evaluating its performance, and initial performance
measurements. You should also
describe what you intend to do to try to optimize its performance. You will need to develop a benchmark
program that you will use to test the performance of your implementation. This benchmark program should be
operational (over the MPI software).
Describe the benchmark program in your checkpoint report.
·
Final Report: The final report should provide complete
documentation of your algorithm, discuss the benchmark program and test set up,
as well as all of your performance results. Explain the results (and any anomalies) you observed. It is OK if the “optimized” version
does not outperform the original, but you should be able to explain why.
Project 2: HLA Over
Web Services
The goal of this project is to develop a client/server distributed
simulation system built over web services (specifically, SOAP). A key distinction between the system
developed in this project and software such as FDK is this system uses a
client/server rather than a peer-to-peer approach. Specifically, you should aim for a “thin client”
implementation where the simulation executes at the client, but distributed simulation
services are implemented through a central server.
You need to implement a minimal set of distributed simulation services,
sufficient to create time managed distributed simulations. Specifically, you should implement a
set of services similar to those defined in the HLA. You should conform to the HLA APIs (e.g., as implemented in
FDK) as much as possible. You will need to offer services for federation set up
(federation management services), data exchange (declaration management and
object management services) and time management. You time management services must include both the NER and
TAR services (or equivalent functionality).
In addition to implementing the services, you will need to implement test
programs to exercise them, and benchmark programs to evaluate performance. You should measure basic services such
as the time required to update an attribute, as well as time to perform time
management. You may use programs
like the tm_ping program provided with FDK for benchmarking the system. You will chould compare the performance
of the new system relative to FDK.
·
Checkpoint: Develop an initial implementation of some
set of basic services, e.g., enough to create a federation with two federates
and to exchange data. You should
report the performance of this initial implementation.
·
Final Report: The final report should describe your
complete implementation, and report on its performance. You should also include comparisons
with the original FDK software.
Project 3: Aurora
Performance Evaluation
This project involves performing an extensive performance evaluation of a
recently developed distributed simulation system called Aurora, developed at
Georgia Tech. The system uses a
master/worker approach to execute distributed simulation computations over
(potentially) heterogeneous, distributed computing resources. In this approach, a client machine
wishing to perform computations downloads a work unit from a server machine. Each work unit consists of one or more LPs and events that
need to be processed. After
downloading the work unit, the LP(s) are processed, and the results (resulting
state vector and events) are returned to the server, where they can be
downloaded and processed by other clients.
It should be noted that Aurora was intended to be used in environments where
client machines (and even the server) could be running other applications. The system is intended to utilize
machine cycles on clients that would otherwise go unused. Aurora also allows greater fault
tolerance capabilities than traditional parallel simulation programs where a
processor failure typically results in the entire computation failing, unless a
save/restart capability has been developed.
This project involves becoming familiar with Aurora, developing a significant
benchmark application program (a parallel discrete event simulation; more
sophisticated than queueing networks please!) and measuring its performance
under different configurations. The application must be structured so you can
reconfigure it to complete a variety of different experiments and compare the
resulting performance. The applications must be aim for simulating relatively
large systems. The performance of
the system will depend heavily on how much computation can be performed by a
client each time it executes a work unit.
So, you should structure your LPs so each can model many simulated
entities. This way, by changing
the number of entities mapped to an LP, you can vary the amount of computation
done within a work unit, and evaluate the effect of changing this parameter on
system throughput.
·
Proposal: The project proposal must describe the
application program you will develop, and how it can be reconfigured for
performing a range of experiments.
·
Checkpoint: An initial version of the application
program should be operational. You
should have this running on Aurora by the first checkpoint, and be able to
report initial performance measurements.
·
Final Report: A range of
experiments should be developed and completed.
Project 4: Warp IV
System Evaluation
Warp IV is a commercial distributed simulation software package developed by
a company in California. This
project involves developing a parallel simulation application over Warp IV and
reporting on its performance. You will need to perform a comprehensive
evaluation of the software to assess its performance under a variety of test
workloads. You should also assess
other aspects of Warp IV such as functionality, ease of use, and
programmability (think of this as providing feedback to the developers). Information
concerning Warp IV is available at http://www.warpiv.com.
·
Proposal: The project proposal must describe the
application program you will develop, and how it can be reconfigured for
performing a range of experiments.
·
Checkpoint: An initial version of the application
program should be operational. You
should write an evaluation of your experiences (and difficulties, if any)
developing the application and be able to report initial experiences in
developing the application code.
·
Final Report: A range of experiments should be developed
and completed.
Project 5: On-Line
Distributed Transportation Simulation (and Handheld Computers)
This project involves developing a client/server-based distributed
simulation intended for on-line use.
By on-line we mean the use of the simulation in real-time as a means to
optimize an operational system; the simulation is initialized with the current
state of the system, and experiments conducted to determine future states of
the system, e.g., for performance optimization efforts. The target application concerns
transportation systems. It is envisioned the client machine might be operating
within a vehicle, and could be used to predict future traffic conditions in
order to determine the best route to reach a particular destination. For this purpose, the client should
download the current state of the transportation network (the portion relevant
for the intended user) from a server to initialize its own data structures and
then simulate forward to predict future traffic conditions. This project involves realizing and
experimenting an on-line simulation for transportation networks.
One reasonably straight-forward approach to traffic modeling is to use
cellular automata. A few online
references to models include http://cui.unige.ch/~chopard/Traffic/ca-models.html
(in particular, see http://cui.unige.ch/~dupuis/duisburg.pdf),
http://sjsu.rudyrucker.com/~han.jiang/paper/#_Toc40035682,
http://rcswww.urz.tu-dresden.de/~helbing/RoadApplet/
http://www.thp.uni-koeln.de/~as/Mypage/traffic.html,
and http://www.sim.inf.ethz.ch/papers/nagel-etc-2lane/nagel-etc-2lane.pdf. If you do a google search on 'cellular
automata traffic modeling' or 'cellular automata traffic sim' you'll get some
good sources.
There are many options in terms of the hardware and software environment to
be used for this project. One
option is to run the client on a handheld computer. Two Windows-based handhelds are available for use on this
project (so at most two students can do complete this project using this
option). The application could be
constructed using .net.
·
Proposal: The project proposal should outline the
hardware and software environment you will use for the project.
·
Checkpoint: An initial version of the traffic
simulation model to run on the client machine should be developed. It should be developed in a way that is
compatible with the envisioned distributed computing environment you will use.
·
Final Report: The complete system should include the
ability to initialize the traffic simulator from state information available at
the server, and then to execute the simulation forward. Some basic measurements should also be
completed concerning operations such as the time to upload state information
from the server. A full report
describing the implementation should be developed. The “current state” of the transportation network, as
represented in the server, should be files you synthesize to test your system.
Project 6: Parallel
Simulation of Virtualized Systems
This project is intended for students with interests in virtualization
techniques. Virtualization is a
means for creating multiple virtual machines that operate over a single
computing platform. Different
virtual machines may execute different operating systems, for instance. The goal of this project is to develop
a parallel simulation approach and an initial proof-of-concept prototype for
simulating a set of virtualized machines executing on a single uniprocessor
system.
Virtualization, by design, attempts to ensure that each virtual machine (VM)
is isolated from the other VMs that are sharing the same hardware
platform. This suggests an
approach focusing on the concurrent simulation of separate VMs offers some
promise. This project should
examine the use of time parallel
simulation techniques to parallelize the simulation of virtualization systems.
Briefly, time parallel simulation involves dividing the simulation time axis
into distinct non-overlapping time segments, and assigning different processors
to concurrently perform the simulation of each segment. Time parallel simulation has been
successfully applied to simulating certain types of systems, e.g., cache memory
systems using LRU replacement and ATM networks. Here, a separate processor can be assigned to simulate each
VM, offering concurrency up to the number of virtual machines being simulated.
Two issues must be addressed to achieve effective time parallel simulation
of VMs. First, at a functional level, dependencies between VMs, e.g., through
accesses to secondary storage, may exist. Some provision for detecting and
managing such dependencies must be developed. Second, one VM may affect the performance of other VMs, e.g., by changing the contents of the
processor’s cache and TLBs. Use of
“fix-up” computations that take into account such affect after the simulation
of each time segment has been completed offer one possible approach to
addressing this problem. This
project should explore both of these issues to the extent of explaining what
are the main issues that need to be addressed, but should focus on the latter. You must develop a time parallel simulation
algorithm for virtualization system simulation, and realize of prototype
demonstrating your approach to solving some of the problems you have
identified.
·
Checkpoint: The checkpoint should give a description of
the problem, issues, and discussion of possible solution approaches. A time parallel simulation algorithm
should be proposed.
·
Final Report: An implementation of the proposed
simulation algorithm must be developed, and its performance evaluated. The final report should describe your
complete implementation, and report on its performance.
Project 7: Parallel
Simulation of Multiprocessor Systems
This project is intended for students with interests in multiprocessor
hardware architectures. The goal is to examine the use of optimistic parallel
discrete event simulation techniques to perform instruction-level simulations
of large-scale multiprocessor systems.
The large memory and computational requirements of such simulations and
the inherent parallelism of the problem domain suggest the use of parallel
simulation techniques. Optimistic synchronization techniques such as
Jefferson’s Time Warp algorithm offer promise to addressing this issue, but
have large memory requirements because past state information must be saved to
allow one to roll back the computation.
The use of reverse execution techniques to greatly reduce the memory
required to perform optimistic parallel simulations by re-computing past state
information as needed, rather than using traditional state saving mechanisms. Preliminary work using this approach has
been completed [1] . The purpose of this project is to continue this work by
developing an initial proof-of-concept prototype and to measure its performance.
This project involves (1) developing a simple optimistic parallel simulator,
e.g., using MPI, and (2) developing a demonstration prototype multiprocessor
simulator using reverse execution techniques. Because of time constraints, you will not be able to develop
a very sophisticated simulator, so you are free to make simplifying assumptions
and manually develop test programs to exercise your system.
·
Checkpoint: Develop a design of the optimistic parallel
simulator, and demonstrate that it is operational.
·
Final Report: The final report should describe the
multiprocessor simulator you developed.
Time permitting, some initial performance measurements might also be
included.
[1] K. Yang and R. Fujimoto, “Parallel Simulation of Multiprocessor Systems
Using Reverse Execution,” Technical Report, 2004.
Project 8: Scalability
of PDES for Continuous System Simulation
Simulations of physical systems frequently require solutions to partial
differential equations (PDEs), which are traditionally solved using time
stepping schemes. In recent years, discrete event simulations (DES) have been
applied to solving such continuous systems, and achieved significant speedup in
sequential simulations [1,2,3]. The irregular nature of DES, however, poses
many challenges in parallel simulations, such as load balancing, communication,
and synchronization.
The objectives of this project are two folds. The first objective is to
grasp the concept of simulating continuous systems using PDES. The second
objective is to study the tradeoff between communication and load balancing in parallel
discrete event simulations for PDEs. In particular, load balancing and
communication may pose contradicting requirements: For a one-dimensional
problems, for example, a block mapping strategy (as used in [2, Chapter 9])
leads to poor load balance. A cyclic mapping strategy, on the other hand,
introduces excessive communication overhead. A proper block-cyclic mapping may
provide a better compromise and in turn deliver better scalability.
To conduct the study, you need to perform the following tasks:
1. Implement a
one-dimensional parallel DES code to solve the hyperbolic conservation law for
a shock tube problem, as described in [2, Chapter 9] and [3].
2. Perform experimental
study of scalability using different partitioning strategies.
3. Submit a project
report to describe the implementation, summarize the experimental results, and
propose a practical extension of the work (such as high-order formulation or
dynamic load balancing).
·
Checkpoint: Develop an initial implementation of some
set of basic services, e.g., enough to create a federation with two federates
and to exchange data.
·
Final Report: The final report should describe your
complete implementation, and report on its performance.
The project checkpoint and final report should include:
·
Checkpoint: Step 1 above should be completed
·
Final Report: Complete steps 2 and 3 above.
References
1. Ernesto Kofman,
Discrete Event Based Simulation and Control of Continuous Systems, Ph.D.
thesis, Universidad Nacional de Rosario, Argentina, 2003.
2. J. Nutaro, Parallel
Discrete Event Simulation with Application to Continuous Systems, Ph.D. thesis,
University of Arizona, 2004.
3. J. Nutaro, B.P.
Zeigler, R. Jammalamadaka, S. Akerkar, Discrete Event Solution of Gas Dynamics
within the DEVS framework, Lecture Notes in Computer Science, vol 2660, pp.
319--328, 2003.
Project 9:
Multi-Player Game
Define and implement an interactive multi-player game (of your choosing) to
execute over the FDK RTI software.
The game must execute over multiple computers, and involve more than one
interactive user. You may develop
the game “from scratch” or you may extend some existing game for execution on a
distributed computing environment.
You may complete this project on an individual basis, or form a small
team of up to three individuals.
Tasks for this project include:
·
Project selection: Give a brief description of the game
you plan to create for the course project and indicate any partners if more than
one individual will work on the project.
·
Checkpoint: Complete a design of the software. You should turn in a report with the
software architecture describing the main modules, their function, and briefly
describe their interface to other modules.
·
Final report: The multi-player game should be
completed, and a demo presented showing the pieces you were able to get
working. All results should be
documented in the final report.