Project 2: Barrier Synchronization
Goal
The goal of this assignment is to introduce OpenMP,
MPI, and barrier synchronization concepts. You will implement
several barriers using OpenMP and MPI to synchronize multiple
threads and machines. You will document the individual contributions
of each team member in your project write-up.
General Information
- Read this assignment
carefully, in its entirety, before you start coding - it may
save you a lot of time later!
- You may use any machine where
you have access to OpenMP and MPI.
- Use the reference pointers
below to find concrete technical information (e.g., OpenMP, MPI tutorials
and examples).
- If you have any questions,
ASK! Use the newsgroup for broad questions. (And if you see a question in
the newsgroup and you know the answer -- post it!)
Resources
Relevant reading on barrier implementation and testing:
- Umakishore Ramachandran, Gautam Shah, S. Ravikumar, and Jeyakumar
Muthukumarasamy. "Scalability Study of the KSR-1", Parallel Supercomputing,
Vol 22, 1996, 739-759.
- Mellor-Crummey, J. M. and Scott, M., "Algorithms for Scalable
Synchronization on Shared-Memory Multiprocessors", ACM
Transactions on Computer Systems, Feb. 1991.
Usage
Compilers:
- You will need to
use Linux systems for this assignment: Intel's icc is available only
for Linux, and I don't have mpich built for Solaris.
- For OpenMP
compilation, you should use the icc (/usr/local/bin/icc)
compiler. The most useful option for this project is -openmp. The man
pages are accessible with "man -M /usr/local/icc-8.0/man icc" (or
icc-9.0 if the icc in /usr/local/bin is 9.0).
- For MPI
compilation, see the instructions on the MPI introduction page. Use the
script "mpicc" found in /net/hp96/davidhi/mpich-1.2.7-rh9-gcc or
/net/hp96/davidhi/mpich-1.2.7-rhel4-gcc. Choose the version
that matches the system you are using (Red Hat 9 or Red Hat Enterprise
Linux 4), which you can check in /etc/redhat-release. Most
cluster machines are rhel4, while many of the remote access
machines are still rh9. These systems have significantly
different versions of gcc and glibc, so you could run into issues
if you use the wrong one.
- Here is a sample
Makefile.
MPI Specifics:
- MPI requires a
mechanism for launching processes on remote machines. Traditionally,
it has used rsh/rlogin, but these are being phased out in favor of
ssh/slogin (because rsh/rlogin send passwords in plain text).
Most Red Hat 9 systems in the college still use rsh, while none of
the Red Hat Enterprise Linux 4 systems do. Both require some
simple setup for passwordless authentication. The rh9 and
rhel4 mpich libraries I have provided are built to use rsh and
ssh, respectively.
- For rsh-based
systems, you'll need to set up your .rhosts file. Simply run
'echo "+@all-cc " > ~/.rhosts' and
'chmod 600 ~/.rhosts'. If it is working correctly,
you should be able to type 'rlogin ' to
log in to an RH9 system supporting rsh without a password.
- For ssh-based
authentication, you'll need to set up an ssh key for the CoC
machines. Execute 'ssh-keygen -t dsa'. If, at this stage, you
provide a passphrase (which is the secure and recommended option),
you can still do passwordless authentication using ssh-agent.
Next, append the contents of your ~/.ssh/id_dsa.pub to the file
~/.ssh/authorized_keys (e.g. cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys).
Be sure to use the public key (id_dsa.pub), not the private key (id_dsa).
When you log into a machine, type 'ssh-add' and it will prompt
you for your passphrase. After that you will be able to ssh to any
machine in the CoC from the current session without entering
your passphrase. If 'ssh-add' complains about being unable to
connect to your authentication agent, type 'eval
`/usr/bin/ssh-agent -s`' and try again. Test the password-free
connection by ssh-ing to another CoC machine.
- If you'd like
to use the more secure ssh-based authentication on the RH9
systems, simply set the shell variable RSHCOMMAND=ssh (e.g.
'declare -x RSHCOMMAND=ssh' or 'export RSHCOMMAND=ssh').
Specifics
Construct:
1. Use OpenMP to implement a barrier between threads on a single
machine. Create several threads on one machine using basic OpenMP
threads. Use the OpenMP locking calls to implement barrier
synchronization between them. You can choose any barrier algorithm
you want, but implement at least two distinct algorithms.
2. Use MPI to implement synchronization between different machines
running OpenMP (a barrier where each participant is an individual machine).
After all the threads on an individual machine reach the OpenMP
barrier, that machine enters the MPI barrier.
There are several
machines, each with one OpenMP barrier that
synchronizes its threads. Implement a tree-like MPI
barrier. Interface the built-in OpenMP barrier with your MPI barrier
call to synchronize a group of machines.
3. Scale the number of machines from 1 to 8.
Evaluate:
1. Write a testing harness that times many iterations of each
barrier call. Collect performance results for varying numbers of
machines (1-8) and varying numbers of threads (1-4) per machine.
Explain the results. Try unequal numbers of threads on different
machines. How do the results differ from the case where the threads are evenly
distributed across the machines? You don't have to provide a huge number
of different testing configurations, but justify and document your
choices. Remember to describe your testing hardware
configurations!
2. Compare the different algorithms you used to implement the
OpenMP barrier. Explain the results.
Deliverables
- When is it due: Wednesday, Oct. 5th
- What to turn in:
- Your OpenMP barrier implementations (at least two)
- Your MPI barrier implementations (at least a tree barrier)
- Your performance tests and related code
- Makefile
- A write-up documenting your performance results and barrier designs
- Documentation of the contributions of each team member
- A README file including:
- What platform you used.
- How to compile your source and run your program(s).
- Any thoughts you have on the project, including things that
work especially well or don't work.
- How: See the generic project turnin instructions here