CS 4230/6236: Parallel and Distributed Simulation Systems

Fall Semester, 2005

Projects

Preliminaries

Below are descriptions of some suitable course projects.  You may complete one of these projects, or propose your own. Each of the projects defined below, except the last one, is intended to be completed by a single individual.

There are three deliverables for each project with due dates as indicated below:

·          Project Proposal: Friday, October 28

·          Checkpoint Report: Friday, November 18

·          Final Report: Friday, December 9

Project proposals should be emailed both to Professor Fujimoto and the TA.  In some cases, the project proposal need only state which project you plan to work on.  In other cases, some additional information is needed, as indicated below.

The project proposal should be a description of the project you plan to complete.  This will serve as a problem statement that will eventually become part of the checkpoint and final reports due later in the semester.

You are free to propose your own project. If you plan to define your own project you should see or send email to me first to discuss what you want to do.  We are relatively flexible with respect to the project, so long as it involves simulation in some interesting way.  In this case, the project selection write up should describe your project at a level of detail comparable to the project descriptions that follow. Be sure to indicate what milestones you will reach by the checkpoint, as well as the final report.

More than one person may select the same project. You may (and in fact are encouraged) to discuss how you will approach the project with other persons working on the same project, but all code and reports that you turn in must be entirely your own work.

Some of the projects that follow will require making some modifications to the FDK software. We will provide help and suggestions on how to modify this software to complete the project, but keep in mind that complete solutions have not been worked out.

Computer accounts have been set up for you on the systems lab machines.  Please be mindful there are other students using these machines for their research, and we are guests on these machines.  In particular be careful not to allow “runaway processes” executing.  The systems and general instructional clusters are the preferred platforms for project work, however you are free to use whatever computing facilities you have at your disposal.

Project 1: MPI-Based Distributed Simulation

The goal of this project is to realize a variation on FDK's time management software (TM-Kit) that runs over the MPI message-passing library, and maximize its performance. Specifically, the current FDK software computes LBTS values (a lower bound on the time stamp of future messages a processor can receive) using reduction computations.  If a reduction is completed and there are no transient messages discovered, the LBTS computation is complete.  If there are transient messages, the reduction computation is repeated until there are none.  The current software runs over a message-passing library called FM-Lib that is included with the FDK software.

You should (1) implement a version of TM-Kit that runs over MPI (rather than FM-Lib), (2) measure its performance, and (3) develop an optimized version intended to improve performance in some way, and (4) measure the performance of the improved version relative to the original.  Your objective is to build a very fast implementation.  For this purpose, the original version might be simply a port of the original TM-Kit implementation from FM-Lib to MPI, or a new version using the same API (application program interface) and you might consider using MPI’s reduction computations or building another time management algorithm over MPI for the optimized version.  You need not reuse the existing TM-Kit software at all, but you are free to do so if you wish.

To simplify the implementation of this mechanism, you should preserve the existing interface to the time management software where appropriate, but you may modify it if that would be useful.

The preferred version of MPI is accessed through /net/hj1/ihpcl/bin.  The mpicc/mpif77/mpirun commands are scripts which automatically detect the appropriate architecture and OS; the “raw” installation is in /net/hj1/ihpcl/`getarch`/mpich-1.2.5.2.  It was configured with the shared/distributed module turned on.

·          Checkpoint: Aim for having the current version of TM-Kit (or a new version you develop) running over MPI.  You should write a report describing both the current algorithm that is used (in greater detail than the description above!) the test set up for evaluating its performance, and initial performance measurements.  You should also describe what you intend to do to try to optimize its performance.  You will need to develop a benchmark program that you will use to test the performance of your implementation.  This benchmark program should be operational (over the MPI software).  Describe the benchmark program in your checkpoint report.

·          Final Report: The final report should provide complete documentation of your algorithm, discuss the benchmark program and test set up, as well as all of your performance results.  Explain the results (and any anomalies) you observed.  It is OK if the “optimized” version does not outperform the original, but you should be able to explain why.

Project 2: HLA Over Web Services

The goal of this project is to develop a client/server distributed simulation system built over web services (specifically, SOAP).  A key distinction between the system developed in this project and software such as FDK is this system uses a client/server rather than a peer-to-peer approach.  Specifically, you should aim for a “thin client” implementation where the simulation executes at the client, but distributed simulation services are implemented through a central server.

You need to implement a minimal set of distributed simulation services, sufficient to create time managed distributed simulations.  Specifically, you should implement a set of services similar to those defined in the HLA.  You should conform to the HLA APIs (e.g., as implemented in FDK) as much as possible. You will need to offer services for federation set up (federation management services), data exchange (declaration management and object management services) and time management.  You time management services must include both the NER and TAR services (or equivalent functionality).

In addition to implementing the services, you will need to implement test programs to exercise them, and benchmark programs to evaluate performance.  You should measure basic services such as the time required to update an attribute, as well as time to perform time management.  You may use programs like the tm_ping program provided with FDK for benchmarking the system.  You will chould compare the performance of the new system relative to FDK.

·          Checkpoint: Develop an initial implementation of some set of basic services, e.g., enough to create a federation with two federates and to exchange data.  You should report the performance of this initial implementation.

·          Final Report: The final report should describe your complete implementation, and report on its performance.  You should also include comparisons with the original FDK software.

Project 3: Aurora Performance Evaluation

This project involves performing an extensive performance evaluation of a recently developed distributed simulation system called Aurora, developed at Georgia Tech.  The system uses a master/worker approach to execute distributed simulation computations over (potentially) heterogeneous, distributed computing resources.  In this approach, a client machine wishing to perform computations downloads a work unit from a server machine.  Each work unit consists of one or more LPs and events that need to be processed.  After downloading the work unit, the LP(s) are processed, and the results (resulting state vector and events) are returned to the server, where they can be downloaded and processed by other clients.

It should be noted that Aurora was intended to be used in environments where client machines (and even the server) could be running other applications.  The system is intended to utilize machine cycles on clients that would otherwise go unused.  Aurora also allows greater fault tolerance capabilities than traditional parallel simulation programs where a processor failure typically results in the entire computation failing, unless a save/restart capability has been developed.

This project involves becoming familiar with Aurora, developing a significant benchmark application program (a parallel discrete event simulation; more sophisticated than queueing networks please!) and measuring its performance under different configurations. The application must be structured so you can reconfigure it to complete a variety of different experiments and compare the resulting performance. The applications must be aim for simulating relatively large systems.  The performance of the system will depend heavily on how much computation can be performed by a client each time it executes a work unit.  So, you should structure your LPs so each can model many simulated entities.  This way, by changing the number of entities mapped to an LP, you can vary the amount of computation done within a work unit, and evaluate the effect of changing this parameter on system throughput.

·          Proposal: The project proposal must describe the application program you will develop, and how it can be reconfigured for performing a range of experiments.

·          Checkpoint: An initial version of the application program should be operational.  You should have this running on Aurora by the first checkpoint, and be able to report initial performance measurements.

·          Final Report: A range of experiments should be developed and completed.

Project 4: Warp IV System Evaluation

Warp IV is a commercial distributed simulation software package developed by a company in California.  This project involves developing a parallel simulation application over Warp IV and reporting on its performance. You will need to perform a comprehensive evaluation of the software to assess its performance under a variety of test workloads.  You should also assess other aspects of Warp IV such as functionality, ease of use, and programmability (think of this as providing feedback to the developers). Information concerning Warp IV is available at http://www.warpiv.com.

·          Proposal: The project proposal must describe the application program you will develop, and how it can be reconfigured for performing a range of experiments.

·          Checkpoint: An initial version of the application program should be operational.  You should write an evaluation of your experiences (and difficulties, if any) developing the application and be able to report initial experiences in developing the application code.

·          Final Report: A range of experiments should be developed and completed.

Project 5: On-Line Distributed Transportation Simulation (and Handheld Computers)

This project involves developing a client/server-based distributed simulation intended for on-line use.  By on-line we mean the use of the simulation in real-time as a means to optimize an operational system; the simulation is initialized with the current state of the system, and experiments conducted to determine future states of the system, e.g., for performance optimization efforts.  The target application concerns transportation systems. It is envisioned the client machine might be operating within a vehicle, and could be used to predict future traffic conditions in order to determine the best route to reach a particular destination.  For this purpose, the client should download the current state of the transportation network (the portion relevant for the intended user) from a server to initialize its own data structures and then simulate forward to predict future traffic conditions.  This project involves realizing and experimenting an on-line simulation for transportation networks.

One reasonably straight-forward approach to traffic modeling is to use cellular automata.  A few online references to models include http://cui.unige.ch/~chopard/Traffic/ca-models.html (in particular, see http://cui.unige.ch/~dupuis/duisburg.pdf), http://sjsu.rudyrucker.com/~han.jiang/paper/#_Toc40035682, http://rcswww.urz.tu-dresden.de/~helbing/RoadApplet/ http://www.thp.uni-koeln.de/~as/Mypage/traffic.html, and http://www.sim.inf.ethz.ch/papers/nagel-etc-2lane/nagel-etc-2lane.pdf.  If you do a google search on 'cellular automata traffic modeling' or 'cellular automata traffic sim' you'll get some good sources.

There are many options in terms of the hardware and software environment to be used for this project.  One option is to run the client on a handheld computer.  Two Windows-based handhelds are available for use on this project (so at most two students can do complete this project using this option).  The application could be constructed using .net.

·          Proposal: The project proposal should outline the hardware and software environment you will use for the project.

·          Checkpoint: An initial version of the traffic simulation model to run on the client machine should be developed.  It should be developed in a way that is compatible with the envisioned distributed computing environment you will use.

·          Final Report: The complete system should include the ability to initialize the traffic simulator from state information available at the server, and then to execute the simulation forward.  Some basic measurements should also be completed concerning operations such as the time to upload state information from the server.  A full report describing the implementation should be developed.  The “current state” of the transportation network, as represented in the server, should be files you synthesize to test your system.

Project 6: Parallel Simulation of Virtualized Systems

This project is intended for students with interests in virtualization techniques.  Virtualization is a means for creating multiple virtual machines that operate over a single computing platform.  Different virtual machines may execute different operating systems, for instance.  The goal of this project is to develop a parallel simulation approach and an initial proof-of-concept prototype for simulating a set of virtualized machines executing on a single uniprocessor system.

Virtualization, by design, attempts to ensure that each virtual machine (VM) is isolated from the other VMs that are sharing the same hardware platform.  This suggests an approach focusing on the concurrent simulation of separate VMs offers some promise.  This project should examine the use of time parallel simulation techniques to parallelize the simulation of virtualization systems. Briefly, time parallel simulation involves dividing the simulation time axis into distinct non-overlapping time segments, and assigning different processors to concurrently perform the simulation of each segment.  Time parallel simulation has been successfully applied to simulating certain types of systems, e.g., cache memory systems using LRU replacement and ATM networks.  Here, a separate processor can be assigned to simulate each VM, offering concurrency up to the number of virtual machines being simulated.

Two issues must be addressed to achieve effective time parallel simulation of VMs.  First, at a functional level, dependencies between VMs, e.g., through accesses to secondary storage, may exist. Some provision for detecting and managing such dependencies must be developed.  Second, one VM may affect the performance of other VMs, e.g., by changing the contents of the processor’s cache and TLBs.  Use of “fix-up” computations that take into account such affect after the simulation of each time segment has been completed offer one possible approach to addressing this problem.  This project should explore both of these issues to the extent of explaining what are the main issues that need to be addressed, but should focus on the latter.  You must develop a time parallel simulation algorithm for virtualization system simulation, and realize of prototype demonstrating your approach to solving some of the problems you have identified.

·          Checkpoint: The checkpoint should give a description of the problem, issues, and discussion of possible solution approaches.  A time parallel simulation algorithm should be proposed.

·          Final Report: An implementation of the proposed simulation algorithm must be developed, and its performance evaluated.  The final report should describe your complete implementation, and report on its performance.

Project 7: Parallel Simulation of Multiprocessor Systems

This project is intended for students with interests in multiprocessor hardware architectures. The goal is to examine the use of optimistic parallel discrete event simulation techniques to perform instruction-level simulations of large-scale multiprocessor systems.  The large memory and computational requirements of such simulations and the inherent parallelism of the problem domain suggest the use of parallel simulation techniques. Optimistic synchronization techniques such as Jefferson’s Time Warp algorithm offer promise to addressing this issue, but have large memory requirements because past state information must be saved to allow one to roll back the computation.  The use of reverse execution techniques to greatly reduce the memory required to perform optimistic parallel simulations by re-computing past state information as needed, rather than using traditional state saving mechanisms.  Preliminary work using this approach has been completed [1] .  The purpose of this project is to continue this work by developing an initial proof-of-concept prototype and to measure its performance. 

This project involves (1) developing a simple optimistic parallel simulator, e.g., using MPI, and (2) developing a demonstration prototype multiprocessor simulator using reverse execution techniques.  Because of time constraints, you will not be able to develop a very sophisticated simulator, so you are free to make simplifying assumptions and manually develop test programs to exercise your system.

·          Checkpoint: Develop a design of the optimistic parallel simulator, and demonstrate that it is operational.

·          Final Report: The final report should describe the multiprocessor simulator you developed.  Time permitting, some initial performance measurements might also be included.

[1] K. Yang and R. Fujimoto, “Parallel Simulation of Multiprocessor Systems Using Reverse Execution,” Technical Report, 2004.

Project 8: Scalability of PDES for Continuous System Simulation

Simulations of physical systems frequently require solutions to partial differential equations (PDEs), which are traditionally solved using time stepping schemes. In recent years, discrete event simulations (DES) have been applied to solving such continuous systems, and achieved significant speedup in sequential simulations [1,2,3]. The irregular nature of DES, however, poses many challenges in parallel simulations, such as load balancing, communication, and synchronization.

The objectives of this project are two folds. The first objective is to grasp the concept of simulating continuous systems using PDES. The second objective is to study the tradeoff between communication and load balancing in parallel discrete event simulations for PDEs. In particular, load balancing and communication may pose contradicting requirements: For a one-dimensional problems, for example, a block mapping strategy (as used in [2, Chapter 9]) leads to poor load balance. A cyclic mapping strategy, on the other hand, introduces excessive communication overhead. A proper block-cyclic mapping may provide a better compromise and in turn deliver better scalability.

To conduct the study, you need to perform the following tasks:

   1. Implement a one-dimensional parallel DES code to solve the hyperbolic conservation law for a shock tube problem, as described in [2, Chapter 9] and [3].

   2. Perform experimental study of scalability using different partitioning strategies.

   3. Submit a project report to describe the implementation, summarize the experimental results, and propose a practical extension of the work (such as high-order formulation or dynamic load balancing).

·          Checkpoint: Develop an initial implementation of some set of basic services, e.g., enough to create a federation with two federates and to exchange data.

·          Final Report: The final report should describe your complete implementation, and report on its performance.

The project checkpoint and final report should include:

·          Checkpoint: Step 1 above should be completed

·          Final Report: Complete steps 2 and 3 above.

References

   1. Ernesto Kofman, Discrete Event Based Simulation and Control of Continuous Systems, Ph.D. thesis, Universidad Nacional de Rosario, Argentina, 2003.

   2. J. Nutaro, Parallel Discrete Event Simulation with Application to Continuous Systems, Ph.D. thesis, University of Arizona, 2004.

   3. J. Nutaro, B.P. Zeigler, R. Jammalamadaka, S. Akerkar, Discrete Event Solution of Gas Dynamics within the DEVS framework, Lecture Notes in Computer Science, vol 2660, pp. 319--328, 2003.

 

Project 9: Multi-Player Game

Define and implement an interactive multi-player game (of your choosing) to execute over the FDK RTI software.  The game must execute over multiple computers, and involve more than one interactive user.  You may develop the game “from scratch” or you may extend some existing game for execution on a distributed computing environment.  You may complete this project on an individual basis, or form a small team of up to three individuals.

Tasks for this project include:

·          Project selection: Give a brief description of the game you plan to create for the course project and indicate any partners if more than one individual will work on the project.

·          Checkpoint: Complete a design of the software.  You should turn in a report with the software architecture describing the main modules, their function, and briefly describe their interface to other modules.

·          Final report: The multi-player game should be completed, and a demo presented showing the pieces you were able to get working.  All results should be documented in the final report.