CS6210 Advanced Operating Systems
Spring 2002
Programming Project III
Due Date: Due: April 9, 2002


As in the second project, you can work in groups of at most two to complete this project.

In this project, which must be submitted by the due date, you will build a simple distributed file system (DFS) that makes use of several of the techniques that have been discussed in class. In particular, the DFS will cache file data in memories of other clients as is done in cooperative caching. Second, the system will try to keep copies of a dirty block  in the memories of at least two nodes so the data is not lost due to a single client failure.

To avoid problems that arise in translating file names and locating the file server for a named file, we will assume a single server, and low level names, which are UIDs. Assume that UIDs range from 0 to n. The single server runs at a well known location. Since data blocks that are of interest to other client nodes may exist at a node even after applications accessing the file at the node terminate, we need to keep a separate file cacher daemon running at each node. An application process must include a library that translates file requests made by the application and sends them to the local cacher daemon. The local daemon handles such requests and also communicates with the file server or daemons running at other nodes. To provide improved performance and better scalability, we will require that files be cached cooperatively by  the cacher daemons. Furthermore, to improve availability, the cachers will attempt to keep at least two copies of dirty blocks at two different nodes. Since availability is provided by multiple  memory copies at nodes that have independent failure modes, it is not necessary that dirty data be written back to the server quickly to tolerate failures of client nodes. The caches at client nodes will be managed by the cacher daemons running at each client node.

Since space in client caches will be limited, you need to devise  a scheme to keep copies of those blocks that have recently been accessed by application processes. For example, if a block is to be replaced, it is best to replace one that has more than two copies in cacher memories. If no such blocks exist,  it is desirable that the block that has not been accessed for the longest time is replaced. If a dirty block is replaced, it must be sent to the server. Once the server gets a dirty block, it becomes clean for the clients that have its copies. Clearly,  cacher daemons at client nodes and the server will have to keep and use meta-data in order to determine when and what blocks should be replaced.

The file server will make use of a number of data structures. It will support operations on files that are numbered 0 to n. These files will be allocated space contiguously at initialization time. You can pre-fill the space with some printable characters. Since contiguous allocation requires information about the size of the file, we will assume that a file's size is no more than 8 blocks. The server will store these files on a "disk" that is implemented using a large Unix file. The block size on your disk could be small (e.g., 128 bytes). The initial disk blocks will store information about file allocation and free space available on the disk. To further simplify the project, we will assume that the unit of transfer between the server and clients (and hence the unit of caching) is an entire file.

The following are the calls that will be made by client nodes to access files.

 

    /* The myopen call must be the first call made to a file. This call gets the file ready for read and write operations. A value of 0 for  mode means a read, 1 means 
        write and other values are illegal. **/

        myopen(int filenum, int mode);

    /* A read call returns numbytes (when available) from the file that has number filenum into the buffer buf. */

 

        myread(int filenum, void *buf, int numbytes)

    /* A write call writes numbytes number of bytes from buf into the file. */

        mywrite(int filenum, void *buf, int numbytes);

    /* An application closes a file when it has completed its reads and writes. */

        myclose(int filenum);

In addition, an  InitLib() call can be used to initialize any data structures that are used by the library at the client node.

The memory cache of a cacher daemon can hold m pages. You can use small values of m (for example 32). When a client makes a request to its local daemon, it must first check if the desired block is in the cache. Otherwise, it must be fetched from another client daemon or the server.  A cacher must respond to requests of local application processes as well as messages that arrive from remote cachers or the server.

You need to setup communication among cacher daemons and the server. You need to define the application level communication protocols by specifying the types of messages needed and their contents. You can use TCP sockets or can build a reliable protocol on top of UDP sockets. For a demo, you can assume that there will be four client daemons running at different nodes and the server. The connections can be created once and then used until a client daemon shuts down. You can also assume that there can be at most 256 files. You also need to set up communication between application processes and their local cacher daemon. You can use shared memory between the two if you desire or you can use sockets for communication between processes running at the same node.

You must carefully design the application, cacher daemon and server sides to get the desired behavior. For example, the server may need to communicate with a cacher daemon to request that it forward a block. The daemon must be able to handle such messages as well as requests that come from local application processes. One way to handle this is to make the cacher daemon multi-threaded, and one thread takes care of communication with the server. The thread responsible for such communication must run on a separate kernel thread to avoid the blocking system call problem. Similarly, you need to decide if the server is single or multi-threaded and how the library and other application threads executes.

The T.A. will use a request set for each client application that your program must be able to handle correctly. This set will include a sequence of  myopen,  myread,  mywrite and myclose calls to a number of different files at each of the client nodes. You must demonstrate that the applications do read consistent data. The request set files

may include several kinds of commands that are processed by the application process at each node. The format of such files will be as follows.

open <filenum> <mode>

read <filenum> <num of bytes>

write <filenum> <write string>

close <filenum>

You will also need to study the performance of your implementation. In particular, you will need to measure and compare the execution times of various calls when they are executed with cooperative caching vs. when there are no caching daemons. You should be able to explain performance results that you obtain. 20% of the grade will be for the performance measurement part.

 

Grading Criteria

60%, cooperative caching and global replacement algorithms and their  implementation, including clear definition and implementation of the protocol that is used for coordination among the daemons and the server.

20%, performance measurement

20%, Code quality, demonstration