High availability software DSM: Beehive

Sponsor Kishore Ramachandran
rama@cc
217 CCB
Area Systems/Architecture

Problem
Beehive is a cluster system developed at Georgia Tech. It supports shared-memory-style parallel programming on a cluster of Sun Ultras interconnected by a high-speed network. The API lets applications use one of two memory consistency models: delta consistency (DC) and release consistency (RC). Beehive also implements a cooperative, application-level strategy for tolerating individual node failures. We have protocol enhancements that make the memory models highly available (see the web links below). Currently, however, only the highly available DC memory model has been implemented. Your task is to work out the implementation of the highly available RC memory model in the context of the Beehive system. You should write a short report describing how you will accomplish this goal.
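
As a reminder of what the RC model promises, the following is a minimal single-process sketch of release consistency in general, not of Beehive's API or protocol. The names HomeMemory, RCNode, acquire, release, read, and write are hypothetical and chosen only for illustration; locking, networking, and the high availability enhancements are all left out. Writes stay local until the writer performs a release, and another node sees them only after its next acquire.

```python
class HomeMemory:
    """Hypothetical stand-in for the authoritative copy of shared data."""

    def __init__(self):
        self.data = {}


class RCNode:
    """Toy node that follows release consistency against one HomeMemory."""

    def __init__(self, home):
        self.home = home
        self.cache = {}    # values this node can currently read
        self.pending = {}  # writes made since this node's last release

    def acquire(self):
        # Acquire: refresh the local view so that every write released
        # before this point becomes visible to later reads on this node.
        # Unreleased local writes are reapplied on top of the fresh copy.
        self.cache = dict(self.home.data)
        self.cache.update(self.pending)

    def write(self, addr, value):
        # Writes are applied locally and remembered until the next release.
        self.cache[addr] = value
        self.pending[addr] = value

    def read(self, addr):
        # Reads are served from the local view only.
        return self.cache.get(addr)

    def release(self):
        # Release: publish all pending writes to the home copy.
        self.home.data.update(self.pending)
        self.pending.clear()


def demo():
    """Return what node b reads before and after node a's release."""
    home = HomeMemory()
    a, b = RCNode(home), RCNode(home)
    a.acquire()
    a.write("x", 1)
    b.acquire()
    before_release = b.read("x")   # a has not released yet
    a.release()
    stale = b.read("x")            # b has not re-acquired yet
    b.acquire()
    after_acquire = b.read("x")    # released write is now visible
    return before_release, stale, after_acquire
```

In a real DSM the acquire and release operations are tied to synchronization (for example, lock acquisition and release), and a highly available variant must additionally decide what state survives a node failure between a write and its release; the papers linked below describe the approach to follow for Beehive.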

If you are more adventurous, you can look at the actual implementation to see how your design would be implemented in it (this is not required for the course).

Background
http://www.cc.gatech.edu/computing/Architecture/Beehive/
(for a description of Beehive)

http://www.cc.gatech.edu/computing/Architecture/papers/temporal.ps.gz
(describes the temporal primitives in Beehive, as well as the software DSM architecture of Beehive)

http://www.cc.gatech.edu/computing/Architecture/papers/ha-temporal.ps.gz
(describes the high availability support for DC memory model in Beehive)

http://www.cc.gatech.edu/computing/Architecture/papers/ha-rc.ps.gz
(describes a highly available RC memory model)

Deliverables
An implementation document for the highly available RC memory model in Beehive.

Evaluation
Based on the completeness of the document.


updated by Kishore, 8/31/99, 5:30pm.