Project III

Due Date: November 24, 1997.

Getting Started

In this project we will develop a distributed service that can be accessed by a number of clients running at remote nodes. In particular, we will design, implement and evaluate a highly simplified distributed servive. Unix sockets will be used for communication between processes that run on different nodes.

Project Goals

The goal of the project is to write message passing distributed applications and to implement and evaluate some of the techniques that are covered in the distributed file systems papers. Basically, a set of clients can access data stored in N files. These files are numbered from 0..N-1 and are identified using a number. Copies of these files are initially stored at a server node. Client processes, running at different nodes, issue commands to open() and close() files. An open() call specifies if a client will only read the file or the file is also written. Once a file is opened in an appropriate mode, a client can issue read() and write() calls. These calls are executed with file copies that are cached at the client. A client can cache some number of files (cache size M << N). Thus, the client API consists of the following calls:

The implementation of these calls must be consistent with the following requirements:

  1. When a client issues a read or write request, the accessed file must be brought to the cache of the client node (if it already does not exist in it). Servicing of such cache misses requires determining where a current copy of the file exists, how to transmit it and where in the cache the file is placed. Another file may have to be replaced to make room for the incoming file.
  2. If a file is opened in read mode by a number of clients, they should be able to cache it locally and read it without communication with server on each access. However, for concurrent writes or a read and a write, it is necessary to coordinate the accesses to provide consistent access. We will adopt the following solution: a set of readers or single writer is allowed to cache a file at a given time. The server maintains a FIFO queue to decide which clients should be allowed to cache the file. On a close by a writer or last reader, the next client(s) will be provided a copy of the file. A write should also invalidate any read copies that were previously cached.
  3. We want to do some sort of evaluation of our system. In particular, we will measure the times of calls that are used to access files. Although controlled experiments for measuring the various costs may not be feasible, microbenchmarks should give us some insights into where the major costs are.

    Implementation Details

    To implement the described system, we need to come up with a design for the server as well as the clients, and define the various types of messages that will be exchanged between them. At the server, a table will be maintained that will have pointers to information that includes a current copy of the file, clients that have its copy and other relevant information. A similar table at the clients will have a fixed number of entries. Each entry will identify the cached file for which it stores information, the file data and other information that you may need.

    To simplify things, assume all files are of size 1Kbytes.

    You will need to multi-thread the client and server sides. A dispatcher thread in the server can receive client requests and fork worker threads to handle them. At the client side, we must allow the client to execute whatever code it wants mixed in with calls to read and write files. This means that we cannot assume that the client will receive and handle messages that arrive at it in a timely fashion. The problem of getting the client's attention when a message arrives can be solved in a number of ways. For example, we can use a Cthreads thread running on a separate logical processor as a handler thread. This handler thread waits on messages to arrive and handles them when they are received. Other options include periodic polling via the select() call or the use of signals.

    You can assume that the server is started first and it is initialized with copies of N files (you can set N to 64 -- files are numbered 0 -- 63). The inital state of all files should be a string of 1024 '.' characters. The socket address of the server is well known. To avoid the problems that arise due to unreliable communication, you can use TCP sockets. You can also assume that the set of clients (say 4) is statically determined. Each client cache can hold 8 files.

    In addition to the client API defined above, you can use an InitSystem() call that is executed before any read/writes are done. The read and write calls access the file number specified in the call parameter.

    You need to implement a replacement algorithm for the client caches. You are free to choose whatever replacement algorithm you like (as long as it is a reasonable one).

    Finally, for performance comparisons, you should do the following measurements:

    1. Time to read a complete file that is in local cache
    2. Time to read a complete file when it is fetched from server
    3. Time to write a complete file when one other client has its read copy

    Getting accurate times for the above operations is quite tricky in a shared environment. Also, you may not have access to a fine-grain timer to measure these times. You can repeat an operation many times and can choose the average time to deal with some of these problems.

    Testing

    In order to test the program, implement a client which reads in a list of commands from a text file and executes them. An example text file looks like this. Each line is a command, and the first character on that line specifies what the command is:

    Example Test Sequences

    Here are some test sequences that your program should work with. The grader will in addition try at least one more test set not listed here.

    Note: these test sequences have not yet been debugged; if something seems funny, feel free to email Lex about it.

    Test 1:

    Test 2:

    Test 3:

    Deliverables

    The following should be turned in:
    1. The program code. This should include the server, the client module, the test client program, and a program for doing some timing benchmarks. (These can overlap if desired). Please do not send binaries.
    2. Instructions on how to compile and run the program.
    3. A description of the communication protocol used between the program nodes. It would be enough to include a description of each message type specifying the direction of the message (client-to-server or server-to-client), the data included in the message, and a general description of the message's meaning.
    4. A documentation of the timing benchmarks performed and their results.
    Compilation instructions and any other comments may be included within program code comments if desired, but the other documentation items should be separate from the code.

    Other Notes