Systems Area Questions for Qualifier

Systems Area Questions for Qualifier

Fall 2000

Answer 6 of 9 questions

  1. Lamport Clock (LC), which is a logical clock, can be implemented very efficiently because each node needs to maintain a counter and messages exchanged by processes only need to carry a single value read from the sender's counter. Unfortunately, if communication between processes is infrequent, the clock value at one process does not meaningfully describe the progress of other processes. One way to address this problem is by adding periodic communication between nodes. Such heartbeat messages are exchanged between nodes even when the applications running on them do not send messages. We call this new clock the "Live Lamport Clock" or LLC.

    Consider the following problem. A set of clients make requests for a service to a server. Each request is assigned a timestamp based on the clock value of the client at the time its request is made. We require that the server execute requests in increasing order (e.g., once it executes a request with timestamp t, requests executed in the future should have timestamps equal to or higher than t). Is it possible to use timestamps read from either LC or LLC to implement an ordering algorithm at the server which allows it to order requests in increasing order without sending any additional messages? If your answer is yes, present the algorithm(s) and discuss their correctness. If the answer is no, develop an algorithm that may require additional communication between the server and clients but will order requests in increasing timestamp order.

  2. Consider a distributed system which consists of several resourceful servers and a large number of thin clients. Assume that the thin clients only have volatile storage and must store the long term results of their computations at one of the servers if the results must survive failures. A computation in such a system may span many servers as well as thin clients. For example, multiple users could be sharing files or work with collaborative applications such as joint schedules or designs. The applications both read and update the shared state. To deal with failures, we want to provide some sort of transaction support. The log managers, that log volatile data and transaction status records, only run on the servers. Design a transaction system for such an environment and discuss its implementation. What are the primary performance concerns for such a system and can they be addressed by using a custom transaction model rather than the standard one that is commonly employed in more homogeneous systems.

  3. Protection models and protection matrices are implemented in many operation systems as a way of protecting objects from unauthorized access by subjects. Define what subjects, objects, and protection matrices are. Explain how a protection matrix is used to implement a protection model.

    The unix operating system treats I/O devices, processes, and many other objects as files. Unix files have access rights associated with them. Describe the access rights associated with unix files. What are the security related advantages and disadvantages of the unix protection model?

    A subject and the rights associated to it defines a protection domain. The rights given to a subject should follow the principle of least privilege. Define what the principle of least privilege is. Sometimes a subject needs to perform a task with more, or less, rights than initially associated with it. So, many OS's implements protected procedures that switch to a protection domain different from the one of the subject that called it. What are the problems associated with protection domains switching? What are the mechanisms in Unix to switch to another protection domain? Is there any security problem in Unix that is related to protection domain switching?

    1. Describe in detail the ordering semantics typically used in parallel discrete event simulation programs. What constraints are usually placed on each simulation process with respect to this ordering?

    2. Why aren't other ordering semantics typically used in parallel discrete event simulations? For example, much effort in the distributed computing community has gone into defining and implementing causal ordering. Why isn't this used? Also, distributed virtual environments typically use unordered communication. Why isn't this used for parallel discrete event simulation programs?

    3. Why don't distributed virtual environments such as Distributed Interactive Simulation (DIS) typically use the ordering semantics used for parallel discrete event simulation? Should they?

  4. Distributed systems are prone to failures due to a myriad of reasons. As computing becomes pervasive, and as applications become more complex (such as B-2-B commerce) it is imperative that distributed systems be built robust enough to tolerate and recover from failures. The problem is always the classic one of optimizing for performance versus designing for robustness. One school of thought is not to provide any mechanisms for robustness and let the applications design their own failure tolerance and recovery. Client-server distributed systems have been built in the past that provide a variety of mechanisms ranging from low-level ones at the transport level such as timeouts and virtual circuits, to high-level ones such as replication of computation and state. Atomic transaction at the operating system level has also been proposed by some systems as an intermediate solution between these two extremes.

    Imagine you are building distributed systems mechanisms for B2B types of applications. Your goal is to provide acceptable performance while at the same time providing support for fault-tolerance and recovery. In this question, enumerate what the issues are at the application level, explore the space of possible solutions, suggest a system architecture that embodies a solution approach, and argue qualitatively why your architecture is something we should bet our meager savings on.

  5. Software fault isolation (SFI) or "sandboxing" is a technique used to enforce certain constraints on programs.

    1. Describe the technique.

    2. For what kinds of constraints is SFI appropriate; for what kinds of constraints is it inappropriate?

    3. Compare SFI to the Java virtual machine, JVM, as a technique for achieving run-time safety.

  6. Parallel systems typically offer some form of coordinated scheduling of processes on different processors (coscheduling or gang scheduling), yet also typically make such scheduling an option. What issues make coordinated scheduling desirable or undesirable? Answer with specific scenarios that perform well and poorly under each scheduling discipline. Suggest approaches for performing best under all circumstances.

  7. Various researchers argue that future computer systems can expect to spend relatively little of their time in traditional "computing" applications and the bulk of their time in "communication" applications -- processing streams of data: multimedia communication applications, web server applications, sensor processing applications and the like. If true, what aspects of operating system mechanisms, policies and/or structure are likely to change as a result of this long-term shift in application focus? Consider at least scheduling, memory management and protection.

  8. This question pertains to software and hardware issues in memory management in modern operating systems (such as Linux and NT) on modern processors such as DEC Alpha and Intel P-III.

    1. Give a detailed description of memory management. Your answer should cover both the hardware and software issues in memory management. Explain what aspects are handled typically in hardware and what aspects are handled typically in software. Explain why such a division of labor between hardware and software is reasonable or unreasonable. If an aspect can be handled in either software or hardware, then what are the pros and cons for each approach.

    2. Page tables implemented to date include single-level, multilevel forward-walked, multilevel reverse-walked (i.e. single-level but with the page table itself in virtual memory) and inverted. What are the pros and cons of each organization? What do future trends in technology and applications bode for the different organizations?

    3. Using the memory hierarchy of any modern processor (physical memory, virtual memory, page tables, TLB, L1, and L2, caches) walk through what design decisions you will have to make in designing the memory management piece of the operating system.