Catamaran: Using Replication to Build High-Performance Distributed Systems

Principal Investigators: Mustaque Ahamad and Mostafa Ammar ,
College of Computing ,
Georgia Institute of Technology , Atlanta, GA 30332.

The proposed National Information Infrastructure (NII) should make it feasible to efficiently share information across a large number of widely distributed users. The intense desire among the public for the large scale, ubiquitous availability of information is likely to provide the economic justification for the deployment of the NII. This project investigates how to improve the efficiency of data access using a variety of techniques based on replication and caching. One of our primary concerns is the provision of low-cost systems that can efficiently support a very large user population. Needless to say, our ability to provide low-cost large scale information access can have far-reaching impacts on a variety of economic sectors.

A novel feature of our work is the explicit consideration of the interaction between the support provided by the networking infrastructure and the performance exhibited by particular applications. Our work is aimed at building a prototype information system that is centered around two main ideas: 1) the exploitation of flexible and dynamic data consistency semantics, and 2) the exploration of novel communication paradigms to support scalable information sharing systems in general.

The consistency problem arises in large scale distributed systems because multiple copies of a shared data item are often created. This replication can be due to caching for improved performance or to increase the availability of information when nodes and the communication network can experience failures. The consistency problem deals with the question: if a copy of shared information at a node is updated, how and when should the changes be reflected in other copies? Traditional consistency conditions which require that all users view changes to shared information in a single order are not appropriate in large scale systems. In fact, such consistency either cannot be provided (e.g., when communication cannot be completed because a user is disconnected) or can introduce excessive delays in processing access requests.

We have developed a framework for precisely defining a range of consistency that is appropriate for different types of applications. Such characterization has led to the development of new algorithms that provide efficient and scalable solutions to information sharing. We have also explored techniques that can be used to identify the appropriate level of consistency when the sharing needs of an application are known. We are currently building a prototype system that will allow us to evaluate many of the new consistency algorithms and communication protocols.

Multipoint communication is an integral part of the data access technology we are developing. Multipoint communication can be viewed as communication "replication" (as opposed to the traditional storage replication). Much of the work on multipoint communication so far has been in the context of video distribution or video-conferencing. For our data access applications we require stronger reliability and have developed a reliable multicast transport protocol that is based on the Internet's TCP protocol. It can be easily integrated within the Internet framework. We are currently making this software available for others to experiment with. As a first step to understanding the issues involved in scalable information access services, we have also considered performance problems that are experienced by World Wide Web servers. We have proposed a technique based on our multicast transport protocol that could help in reducing the load experienced by a server and provide faster response time to clients.


More Information

The slides of an interesting presentation on the topic can be obtained by clicking here .

Maintainer:

Prince Kohli (pkohli@cc.gatech.edu)
Last Modified Tue Aug 29