CS6210 : Project 2

Building a Distributed File System

 

Background

 

To complete this project, you would need to be cognizant with the following papers.

 

  1. Nelson, M.N., Wlech, B.B., Ousterhout, J.K., "Caching in the Sprite Network File System", ACM Transactions on Computer Systems, 6, 1, pgs. 134-154, February 1988. (self study)
  2. Feeley, Morgan, Pighin, Karlin, Levy, Thekkath,, "Implementing Global Memory Management in a Workstation Cluster", Fifteenth ACM Symposium on Operating System Principles, Dec. 1995

 

In the writeup below, I will be using terms which you may not have encountered yet, but will study in class soon. You MUST read the papers above to even start on this project.

 

Due Date

 

Friday, Oct 15 by midnight of Friday. ( To clarify, that is end of Friday and before Saturday begins! )

 

Description

 

In this project, we are going to build a simple distributed file system (DFS) that will cache file data in memories of other clients as is done in cooperative caching. The system will try to keep copies of a dirty block  in the memories of at least two nodes so the data is not lost due to a single client failure.

 

Naming and Server Design

 

To avoid problems that arise in translating file names and locating the file server for a named file, we will assume a single file server, and low level names, which range from 0 to n. The file server runs at a well known location. The file server will make use of a number of data structures. It will support operations on files that are numbered 0 to n. These files will be allocated space contiguously at initialization time. You can pre-fill the space with some printable characters. Since contiguous allocation requires information about the size of the file, we will assume that a file's size is no more than 8 blocks. The server will store these files on a "disk" that is implemented using a large Unix file. The block size on your disk could be small (e.g., 128 bytes). The initial disk blocks will store information about file allocation and free space available on the disk. To further simplify the project, we will assume that the unit of transfer between the server and clients (and hence the unit of caching) is an entire file.

 

Caching Design

 

Data blocks that are of interest to other client nodes may exist at a node even after applications accessing the file at the node terminate, we need to keep a separate file cacher daemon running at each node. An application process must include a library that translates file requests made by the application and sends them to the local cacher daemon. The local daemon handles such requests and also communicates with the file server or daemons running at other nodes. To provide improved performance and better scalability, we will require that files be cached cooperatively by the cacher daemons. Furthermore, to improve availability, the cachers will attempt to keep at least two copies of dirty blocks at two different nodes. Since availability is provided by multiple memory copies at nodes that have independent failure modes, it is not necessary that dirty data be written back to the server quickly to tolerate failures of client nodes. The caches at client nodes will be managed by the cacher daemons running at each client node.

 

Replacement Policy

 

Since space in client caches will be limited, you need to devise a scheme to keep copies of those blocks that have recently been accessed by application processes. For example, if a block is to be replaced, it is best to replace one that has more than two copies in cacher memories. If no such blocks exist, it is desirable that the block that has not been accessed for the longest time is replaced. If a dirty block is replaced, it must be sent to the server. Once the server gets a dirty block, it becomes clean for the clients that have its copies. Clearly, cacher daemons at client nodes and the server will have to keep and use meta-data in order to determine when and what blocks should be replaced.

 

Cacher Daemon Design

 

The memory cache of a cacher daemon can hold m pages. You can use small values of m (for example 32). When a client makes a request to its local daemon, it must first check if the desired block is in the cache. Otherwise, it must be fetched from another client daemon or the server.  A cacher must respond to requests of local application processes as well as messages that arrive from remote cachers or the server.

You need to setup communication among cacher daemons and the server. You need to define the application level communication protocols by specifying the types of messages needed and their contents. You can use TCP sockets or can build a reliable protocol on top of UDP sockets. For a demo, you can assume that there will be four client daemons running at different nodes and the server. The connections can be created once and then used until a client daemon shuts down. You can also assume that there can be at most 256 files. You also need to set up communication between application processes and their local cacher daemon. You can use shared memory between the two if you desire or you can use sockets for communication between processes running at the same node.

You must carefully design the application, cacher daemon and server sides to get the desired behavior. For example, the server may need to communicate with a cacher daemon to request that it forward a block. The daemon must be able to handle such messages as well as requests that come from local application processes. One way to handle this is to make the cacher daemon multi-threaded, and one thread takes care of communication with the server. The thread responsible for such communication must run on a separate kernel thread to avoid the blocking system call problem. Similarly, you need to decide if the server is single or multi-threaded and how the library and other application threads executes

 

Filesystem API

 

The following are the calls that will be made by client nodes to access files.

 

    /* The myopen call must be the first call made to a file. This call gets the file ready for read and write operations. A value of 0 for  mode means a read, 1 means 
        write and other values are illegal. **/

        myopen(int filenum, int mode);

    /* A read call returns numbytes (when available) from the file that has number filenum into the buffer buf. */

                                myread(int filenum, void *buf, int numbytes)

    /* A write call writes numbytes number of bytes from buf into the file. */

        mywrite(int filenum, void *buf, int numbytes);

    /* An application closes a file when it has completed its reads and writes. */

        myclose(int filenum);

 

In addition, an  InitLib() call can be used to initialize any data structures that are used by the library at the client node.

 

Testing and Submission Policy

 

I will use a request set for each client application that your program must be able to handle correctly. This set will include a sequence of  myopen,  myread,  mywrite and myclose calls to a number of different files at each of the client nodes. You must demonstrate that the applications do read consistent data. The request set files

may include several kinds of commands that are processed by the application process at each node. The format of such files will be as follows.

open <filenum> <mode>

read <filenum> <num of bytes>

write <filenum> <write string>

close <filenum>

You will also need to study the performance of your implementation. In particular, you will need to measure and compare the execution times of various calls when they are executed with cooperative caching vs. when there are no caching daemons. You should be able to explain performance results that you obtain.

 

You can develop the filesystem in a Linux or Solaris environment, just explicitly mention it in the writeup. The Due Date is Midnight of Friday, Oct 15.

 

You will need to submit the following to the TA