CS 6210 Advanced Operating Systems
Homework 2
Due Date: Thursday, April 12, 2001

 
  1. The xFS file system utilizes a number of techniques to build a scalable and fast distributed file system. These include cooperative caching, scalable consistency, log structured writes and RAID. Consider the following scenario in  xFS. A client C's cache is currently full and contains blocks that only exist in this client's memory. C experiences a cache miss and the block needed by it is available in the memory of a single remote client.
    1. Describe the data structures that are accessed by C and other nodes that may be contacted during the processing of its request. Explain how these data structures are maintained.
    2. Describe the messages that are exchanged between nodes to satisfy C's request for the block when it experiences a cache miss.
    3. Discuss the scenarios under which the response time experienced by C is shortest and longest?
    4. Will your answers change if the block for which C experiences the miss is available in multiple client memories? Explain.
  2. Distributed shared memory provides strong consistency which is not needed for many applications. For example, dynamic content on the Web can be cached (e.g., CNN home page which is frequently updated).  If a page containing dynamic information is cached, a time-to-live (TTL) field can be associated with it. A client cache refresh the page when its TTL expires. This approach, which differs from the invalidation based approach of DSM, could allow clients to access stale data but requires no synchronous invalidations. Consider a web server that services m clients between consecutive updates to the data in the page. These clients perform a total of n reads of this data (assume n >> m). Furthermore, assume that the TTL expiration time is 1/k of the average time between updates to the page. Is it possible to compute the average number of messages in this system for the invalidation based protocol that provides strong consistency, and the TTL based scheme? If yes, how do the message costs compare? If stale data can be tolerated, the TTL time may be greater than the time between updates. In this case, how will the message costs compare? Explain your answer.
  3. The continuity constrain that we discussed in class for multimedia systems only considered the time that is required to switch streams and move data from disk to the buffers that are accessed by the recording/displaying device.  If the device is attached to a remote node, the data has to be transferred over the network and processed by the CPU before it can be accessed by the device. Develop a model that allows you to derive a continuity constraint that accounts for the disk, network and CPU resources that are utilized by the multimedia application. Feel free to make simplifying assumptions when necessary.
  4. We discussed a network packet scheduler that schedules packets based on their expected delay (e.g., packet with minimum value for delay is scheduled next). The paper states that this algorithm is also called a time-stamp based scheme. Explain why scheduling based on time-stamp of packet arrival is equivalent to this algorithm.
  5. The Internal Revenue Service (IRS) is encouraging people to file their taxes over the Internet. Assume a client-server architecture where tax filers are clients and the requests are handled by the IRS server. Clearly, the server needs to be scalable as it must handle a large number of requests around April 15. You have been asked to design such a server. It should be possible to increase the capacity of the server as demand for its services increases. Also, the server must be highly available and should reliably maintain information about tax returns. Develop a cluster based architecture to implement the IRS server. You should decompose the service functionality and discuss where it can be executed in the cluster. You should also address availability, load balancing and data integrity issues. Discuss the strengths and weaknesses of your design.