Homework 1

Distributed Computing

Due Date: Monday, April 26, 1999

 

  1. The ordering of operations or events is an important problem in distributed computing systems. If a synchronized real-time clock was available to all processes, events timestamped with such a clock can be totally ordered. On the other hand, with a logical clock such as a Vector Clock, one can only establish a partial order between events. Assume that a real-time clock C can be provided to the processes but it is not perfectly synchronized across nodes. In particular, the maximum drift across nodes is bounded by d. Also, message delivery time is bounded by two quantities. A message sent from node P to Q takes at least min time and no more than max. Answer the following questions for this system.

 

 

  1. To correctly detect concurrent events, vector clocks require N components where N is the number of nodes in the distributed system. Present an informal proof that shows that in general, it is not possible to detect causal events with less than N components in the vector clock.

    Now consider a particular system that has m servers and n clients (m << n). It is known that clients never communicate with each other directly. Thus clients only exchange messages with servers. Servers can communicate with clients and other servers. Is it possible to design a more efficient logical clock system than the traditional vector clocks for such a system and still be able to detect all concurrent events? Explain your answer.
  2. In a protocol such as NTP, a node must determine how frequently it should synchronize with a more accurate time server. This frequenly depends on a number of factors, including clock drift rates, message delivery time distributions, and the bound on clock skew. Formulate the problem of determining synchronization frequency and derive approximate bounds by making simplifying assumptions.
  3. The Chandy/Lamport algorithm for recording global states makes no assumptions about clock drifts or message delivery times. If it was known that nodes have a perfectly synchronized clock, and message are delivered in time d, a message can be sent by a process to notify all others that they should all record their states at time t > current-time+d. How can channel states be recorded when such a scheme is employed to record global states? Explain your answer.
  4. There are several applications of distributed snapshot like algorithms. In this question, we want to explore debugging of distributed programs, and distributed checkpoint and rollback recovery. In particular, you need to answer the following questions.