SYSTEMS AREA PH.D. QUALIFYING EXAMINATION
SPRING 2004
IMPORTANT INSTRUCTIONS:
1. Please attempt any six out of the following nine questions.
The exam has total of 5 pages and 9 questions.
2. Please attempt all the subparts of a question (marked (A), (B), (C)
… etc.) unless otherwise stated explicitly (in these cases follow
individual instructions embedded in the given question overriding this
general rule).
3. Make and state assumptions wherever necessary.
4. Please keep your answers precise and to-the-point. Clarity is
important.
5. Please return the answers in the printed form. If written by hand,
please write cleanly and legibly.
1. Distributed Simulation
(A) Consider the Chandy/Misra/Bryant null message algorithm. As
originally proposed, this algorithm assumes the topology indicating
which logical process can send messages to which other processes is
static. Explain what difficulties arise when the topology is
allowed to change dynamically during the execution of the distributed
simulation. Propose extensions to this algorithm to allow dynamic
connections between processes. State what constraints (if any)
must be imposed on the application to realize your algorithm.
(B) Are dynamic connections a problem in Jefferson's Time Warp
algorithm? Explain why or why not.
2. Protocol Construction
Academia and industry have made intensive efforts to automate protocol
construction, typically using some combination of micro-protocols and
of formal methods used for micro-protocol composition.
(A) Briefly outline the current state of the art, including commenting
on the role of formal techniques.
(B) Talk about two basic problems that arise in micro-protocol
construction: 1) there are some specific problems in generating high
performance code for composed protocols vs. protocols written and
compiled as single pieces of code (e.g., using a standard compiler).
Identify at least two and comment on their solution. 2) there are some
distinct advantages of using micro-protocol composition vs. other
approaches to protocol construction. Describe at least two specific
ones.
3. CPU Reservation
The idea of this question is to be a bit creative, taking the next step
in the work on CPU reservation (for multimedia) to which you have been
exposed. Consider the following: rather than reserving CPU for
multimedia, you are asked to build a reservation mechanism that
reserves MULTIPLE resources for multimedia and next, you are asked to
extend that mechanism from single to multiple machines. After all, for
realistic multimedia applications, performance concerns are end to end.
(A) Can you easily generalize the existing CPU reservation mechanism to
dealing with multiple devices?
(B) What issues must be addressed for extending it to multiple
machines? Identify and describe solutions to multiple such issues.
4. Cluster-based Web Cache System
Your new company, SIM, provides cluster-based software for ISP's that
allows them to do simple firewalling and Web caching. A customer
installs a SIM caching cluster according to their expected user
population size. Under normal conditions (typical Web workloads,
no undue network congestion effects, etc.), a single cluster node can
handle about 60 hits per second.
Some of your more sophisticated customers tell you that they would like
to be able to "augment" the caching with additional capabilities. For
example, @mobi.com has many users who use PDA's who access the Web
through their ISP cluster, so in addition to fetching and caching a
page, they would like to be able to perform simple transformations on
data, depending on whether the client is a Windows PDA, Palm PDA, or
regular desktop browser (IE/Netscape). So you agree to design an
API that will allow them to insert "plug-ins" into the cache stream.
Below is the simplified control flow of your existing
architecture. Your software architecture is concurrent: a single
thread handles the incoming request, but after the 'get page from web'
procedure returns, the thread forks into two so that the "store in
cache" and "deliver to client" code paths can be executed in parallel.
Please choose to answer either (A) and (B) or (A), (C) and (D)
from the following. Write down your answers to the best of your
knowledge.
(A) Redraw the diagram to indicate where a customer-supplied
transformation module could fit in the control flow. (There may
be more than one place where it would work.)
(B) In the original logic diagram above, all of your functional blocks
correspond to methods inside a single process. (The "store in
cache" and "deliver to client" blocks can be executed by different
threads, but they are user-level threads in the same process space.)
@Mobi.com wants to know if their plug-in module will run
(a) in the same process space as a separate procedure call that is part
of ,
(b) in the same process space but as a separate (forked) thread,
(c) in a different process space.
Thoroughly enumerate the pros and cons of each approach. Consider
performance, ease of programming, capacity provisioning, and failure
management. Which approach (a, b or c) would you ultimately
recommend and why?
(C) One day, the recording industry companies have a vision in a dream
telling them to put all of their content online for free. Music
fans immediately begin downloading MP3 files. Make and state reasonable
quantitative assumptions regarding the relative sizes of a typical MP3
file compared to other Web workloads, and estimate the reduction in
capacity of your cluster when serving this workload. (Hint:
Assume that under the new workload, about 80% of the bytes transferred
correspond to MP3 files. You'll need to compare this with a
defensible assumption about the old workload, i.e., an assumption based
on some recent data.) Again, clearly state your data
sources and path of reasoning.
(D) MP3's are a lot bigger than images and Web pages, and soon enough,
@mobi.com's cache disks all fill up. (Their system is still
passing requests through, but the cache is thrashing: every time a new
file is brought in from a server, it causes the eviction of an existing
file that will probably be referenced again soon.) @ mobi.com's
administrator is upset, and tells you he is ready to just turn off the
cluster to save electricity. "No!" you say. "There's still
some benefit from having the cache operating, even if it is thrashing!"
Please describe, to the best of your knowledge, what's the benefit, and
who will benefit.
5.Kernel/User-level Interactions:
A lot of papers in the literature have concentrated on improving the
interactions between a kernel scheduler and user-level communication
(e.g., IPC) or synchronization (e.g., locks). For instance, the
scheduler could preferentially schedule the recepient of an IPC call as
soon as the thread that started the IPC is blocked. Or a user-level
thread library could know that an LWP is near the end of its time slice
and adjust its behavior.
Give a short survey of a few techniques along these lines. You should
name at least three different techniques, citing the respective
publications and explaining the essence of each technique. Discuss why
these techniques are representative of other approaches, and present
the tradeoffs involved.
6. Distributed Shared Memory:
There have been a number of papers in the late 80's and early 90's that
were concerned with software distributed shared memory (DSM). DSM has
been used as a programming abstraction for clusters, and as a system
structuring concept in distributed operating systems such as Clouds.
Answer the following questions regarding DSM:
(A) What is appealing about DSM as a programming abstraction? Discuss
the sources of performance advantages and disadvantages. Give code
samples to illustrate your points.
(B) What is appealing about DSM as an operating system structuring
mechanism? Discuss the sources of performance advantages and
disadvantages. Give code samples to illustrate your points.
(C) What is the state of DSM research today? Give a commentary on
the
current state of DSM research.
7. Distributed Systems
There are many distributed applications where information created at
one node, called an "update", has to be disseminated to a large number
of other nodes reliably. Obviously, if multicast is supported, an
update can be sent as a multicast message. However, reliable multicast
suffers from scalability problem because of a number of reasons. One
way to improve scalability is to use "epidemic" style protocols where
an update a sent to a small number of nodes which forward the update to
others and so on. Answer the following questions when such an epidemic
dissemination algorithm is used to propogate an update to all nodes.
(A) If a given node wants to send an update to only
a constant number of other nodes which may be chosen randomly, how many
rounds will ensure that most nodes have received the update?
(B) If updates can be injected into the system at
multiple nodes, what kind of ordering should be imposed across them?
How can such ordering be implemented in a large scale system?
(C) It is possible that some of the nodes can be
compromised and such nodes can try to disrupt dissemination or can even
attempt to introduce corrupted updates into the system. Is it possible
to handle such faulty nodes when their number is bounded? If your
answer is yes, sketch a protocol that would achieve this goal and
specify the bound assumed on the nodes that could be faulty.
8. Real-Time Systems
Although the 1997 Mars Pathfinder rover was generally considered a
success, it had a series of resets that slowed down the mission for
several weeks. As it turned out, the resets were due to a
priority inversion problem. A high priority task (data
distribution that broadcasts collected data through a shared bus) is
suspended pending access to a shared resource. As it happened, a
lower priority task also uses the same resource. When a number of
intermediate priority tasks interrupted the low priority task, the high
priority data distribution task misses its deadline. Since the
data distribution task is considered critical, missing its deadline
causes the system to reset.
(A) Explain the general conditions for a priority inversion problem
such as Pathfinder’s to happen in operating systems.
(B) Outline a solution to solve the priority inversion problem.
Explain how your solution will prevent the conditions above (item 1)
from happening and how it solves the Pathfinder priority inversion
problem.
9. Specialization
It is common for system programmers to optimize the common case
(executed many times) by manually writing an optimized path or by
specialization using tools such as a partial evaluator.
Specialization differs from manual optimization by identifying
explicitly the invariants that hold for the common case and then
simplifying the code under the assumption that the invariants are
true. The specialization process is complete when guards are
included to detect the situations when the invariants are violated and
the specialized code is no longer valid.
Our concrete example is dynamic linking of kernel and library modules,
a standard technique supported by most modern operating system
kernels. A naïve implementation of dynamic linking contains
inefficient code. For example, at linking time, the system looks
up the module to be linked and loads in the code to be linked.
Each time the module is invoked, the system looks up the module’s
location in memory and then branches into the module.
Explain how you can use specialization to eliminate the sources of
inefficiency in the naïve implementation of dynamic linking.
(A) First, identify the common case.
(B) Second identify the focus of specialization by making explicit the
invariants being defined.
(C) Third, explain how the execution path of the common case can be
shortened by exploiting the explicit invariants.
(D) Fourth, explain how you will guard the specialized code by
detecting the violation of invariants and switching the execution to a
general case.