2 CS6420 - Programming Assignment 3

CS6420 - Programming Assignment 3

Overview

For this assignment we will be implementing a "Serverless Distributed File System" called "GFS", such as is described in the paper Serverless Network File System. We will assume a fixed number of systems (5 in our case), and that every system has some files stored locally on a directly attached disk drive. We will assume a fixed sector size of 512 bytes, and that each system has sufficient memory to cache up to 10 sectors. The idea behind our file system will be quite simple. If a given system A needs to read file F1, sector 0, it will first scan it's local cache to see if it already has a copy. If so, this is a "local cache hit", and we can provide the data with a simple memory to memory copy.

If there is no local cache copy, we simply ask the system which has file F1 on his local hard disk (say system B for this discussion) to provide the data for us, using sockets and network reads/writes. We will assume a simple mapping from file name to which system to as described below, although in a real distributed file system this would be a more complex task. System B will first scan his own local cache to see if it has the sector cached, and if so will simply provide it (again using sockets and network reads/writes) from his memory copy. This is called a "remote cache hit".

If system B does not have a cache copy, things get slightly interesting. Every system should remember what other systems have asked for a copy of any local disk sectors. When B finds no local cache copy of sector F1/Sec 0, it will see if it has provided this sector to any other system in the past. If so, it will forward the request on to that other system (say system C for this discussion), and ask C to provide F1/Sec0 to system A. When system C sees the request from B, it will verify it does indeed have F1/Sec0 cached and if so provide it to A via the network. This is called a "secondary remote cache hit".

Finally, if system B has no record of any other system having a copy of F1/Sec0, it will simply read his local disk and provide the data to A after the read completes. This is called a "cache miss". We won't actually be reading or writing any disk in our programs, but we will simulate the delay caused by reading/writing by doing a "sleep(1)" in our program. I agree that 1 second is much too long to really simulate a disk read, but it will help us see if our caching system is really working properly.

System A will ultimately receive a copy of F1/Sec0 per the above discussion. System A will then cache this copy of F1/Sec0 in case he needs it, or in case other systems may need it as described above. If A's cache is full (recall we said that each system only has sufficient memory to cache 10 sectors), it must discard something. For our program, we will choose a simple LRU algorithm and simply discard the head of the cache list (oldest entry). However, when we do this, we must inform the server that provided us this data that we no longer have this availble so it won't forward requests on to us. This is called a "cache discard".

Things also get slightly more interesting when we do disk writes. When we write a sector, we must inform that system that has the sector locally that it has changed (and at the same time provide it the changed data), and that system must also inform all other systems that have it cached that their cached copies are no longer valid. This is called "cache invalidation".

For our purposes, we will use a simple mapping of file name to system as follows: File F0 will be on system 0, F1 on system 1, etc. Each system will be assumed to have only one local file and it will consist of exactly 100 sectors.

In order to verify that the correct sector has been located and processed, we will initialize the contents of each sector as follows: (1) Since each sector is 512 bytes, we can consider it to be an array of 128 integers (4 bytes each). (2) For each of the 32 bit integers, set the upper 8 bits to the file number (in the range 0 - 4) and the lower 24 bits to the sector number (in the range 0 - 99). Here is a subroutine to initialize a sector to the correct values.

You will want a doubly linked list to manage your cache memory, since you may need to remove a node from the list and re-insert it at the head or tail (this would happen if you kept the list in LRU order, and an existing entry was referenced, causing it to now become the most recently used). dlist.c contains some sample routines to manage a doubly linked list if you've never done so before.

Which system to use

We will again use "elvis" for compilation and testing, and will use boise, cheyenne, dallas, eugene and frankfort for the 5 systems participating in the distributed file system. Elvis will simply initiate processes on those other five systems and monitor their progress. "startfs" will handle the remote process initiation for you so you don't need to be too concerned about this.

Coding the Program

I will provide the following: A Makefile, a main program testfs.c, a monitor program startfs.c which will run on elvis and start up the other 5 systems, monitor progress, report results, and help me grade the programs (and help you debug them), a script findfs that will see if you have any left over processes on those systems which did not terminate (a very likely occurrence in the early debugging of this program), and a script killfs that will kill these leftover processes. I will also provide gfs.h which will describe the api you need to implement, and a skeleton gfs.c. Use the following test cases: (This is all...no more to follow!) test1, test2, test3, test4, test5, test6, test7, test8. File prog3.tar.gz contains all the above programs and test cases if you just want to do one download.

Specifications

The details of the specification are found in gfs.h.

Some Things to Consider

We have to decide whether to use TCP sockets or UDP sockets for the network activity. Both methods have advantages and disadvantages. I believe UDP will be a simpler choice in the long run. After coding up my solution to this problem, I decided to not require the "Reliability" using UDP. In other words, you can assume for this assignment that UDP datagrams will show up at the destination correctly.

You will undoubtedly need a multi-threaded program for this to work properly. If you system is off doing a disk read (recall we will simulate this by just sleeping for 1 second), we want to be able to continue to service other requests which might be satisfied by cache hits while this is happening. I strongly suggest you use pthreads for this.

In the thread that services network requests, we will allow bypassing the 1 second delay if you have to simulate a disk read. To do this properly, we would have to have yet another thread which handled disk requests, and this thread would have to know what to do with each disk block read. For this assignment, in the network request processing thread, you can assume a disk read/write is instantaneous.

Compiling and linking your program

Use the Makefile I will provide. This is what we will be using to compile and test your implementation.

Testing Your Program

We will test your program simply by running the test cases I provided in testfs.c (as of this writing those are not complete, but will be complete shortly). Grading criteria will also be provided. A sample output for tests 1 through 6 is here. You can see if you get the same number of hits and reads/writes as I get.

Grading Criteria

Test 1 10%
Test 2 10%
Test 3 10%
Test 4 10%
Test 5 15%
Test 6 15%
Test 7 10%
Test 8 10%
Quality of Code (Neatness, commenting, etc) 10%

Turning In Your Program

Email your "gfs.c" program (not as an attachment, but just as the body of the message) to rly@cc. One way to do this is (on unix):

mail rly@cc < gfs.c

Contact Information:

riley@cc.gatech.edu
College of Computing
Georgia Institute of Technology
Atlanta, GA 30332-0280

Last Modified: Jun. 9, 1998