Systems Fall 2002 Qualifier Exam
INSTRUCTIONS:
Please answer any 6 out of 9 questions
Give detailed answers to questions mentioning all the assumptions you are
making and be comprehensive in your answers showing the details of your approach
and understanding of the problem.
1. Describe a distributed simulation application program where the Time Warp
algorithm will perform well, but the Chandy/Misra/Bryant null message algorithm
will perform poorly. Explain your answer.
Similarly, describe a distributed simulation application program where you
would expect a conservative synchronization algorithm will likely outperform
Time Warp.
2. Once you remove all the window dressings, a "parallel machine"
such as the Origin 2000, a "cluster" such as our own Jedi, or a "distributed
system" such as the set of machines in a geographical area (say the college
of computing), each consists of a set of processors glued together by an
interconnection network.
(a) What is fundamentally different about each of these
classes of machines?
(b) What OS issues are common to all of them?
(c) What OS issues may be relevant in one class of machine
and irrelevant in another?
(d) What are the characteristics of applications that
may be appropriate for each of these classes of machines?
3. Since nodes in distributed systems have independent
failures, it is possible to make a service fault-tolerant by replicating
it at several nodes. The replicated state of the service can be kept
consistent by using protocols that perform read and write operations with
a majority of the replicated instances. Outline the design of a file server
that replicates files at multiple nodes to enhance data availability.
Discuss how such a design may impact performance enhancing mechanisms such
as caching and log-structured file systems.
Unfortunately, compromises of nodes is a more serious problem than crash
failures of server nodes. A compromised node can attempt to corrupt the data
as well as meta-data that is stored at the compromised server. Also, a compromise
that exploits some vulnerability can use a common attack to compromise multiple
instances. Assuming that the number of compromised nodes is bounded by some
threshold, is it possible to design read and write protocols that provide
consistency semantics similar to file systems such as Sprite and xFS? If
your answer is yes, describe the protocols. Otherwise, present arguments
why this may not be possible.
4. Real-Time Systems Question: One of the main goals in real-time scheduling
is to preserve the predictability of CPU schedules despite the execution
of concurrent threads that may interfere with each other. An important
problem in this environment is called priority inversion, where a thread
with higher priority may block, waiting for a lower priority thread running
in a critical section. One solution proposed for solving the priority
inversion problem is called priority inheritance, where the executing thread
inherits the waiting thread's (higher) priority and runs to completion of
the critical section. Explain how priority inversion may be implemented
as an addition to a normal kernel (without other real-time support).
Hint: the critical section code and possibly the scheduler code may need
enhancements.
5. The basic definition of an operating system states that its role
is to `provide support for applications and insulate them from underlying
hardware'. Therefore, since application needs change over time, so will operating
systems. A good example is the inclusion of web servers (i.e., the khttp
stack) in modern OS kernels. An alternative solution, the sendfile system
call, give user level web servers the ability to directly transfer a static
file from a disk to a network device, without further involvement of the
user-level web server, thereby eliminating unnecessary data copying and user/kernel
crossings.
A) Discuss the advantages/disadvantages of the khttp-based vs. sendfile-based
OS support for web services. In particular, arguments have been made that
khttp is unnecessary, since sendfile replaces it.
B) Neither khttp nor sendfile in a server address the increasing importance
of symmetric communications, as experienced in peer to peer systems (e.g.,
Instant Messaging). In the remainder of this question, you will design some
OS support for IM:
a) Imagine a 'transferfile' command executed jointly by client
and server, which directly transfers a file from one IM user to another (e.g.,
to share an image between both). Describe the necessary components of 'putfile'
on the two interacting users' machines. Note that TWO
OS kernels are involved in this transfer, so the two collaborating user programs
will both have to make different system calls, with the joint effect being
the transfer of a file directly from one machine's disk to another one's.
b) Describe the APIs of both OS's portions of the transfer file command.
c) Discuss when and whether performance advantages can be attained by
use of your new OS facilities vs. simply taking the same
actions at
user level.
6. A basic issue with middleware is its flexibility, that is,
its ability to provide the types of services end user applications need.
One topic concerning such flexibility is communications. This is why RMI
(remote method invocation) in Java, for instance, has an underlying facility
called a socket factory. Notwithstanding what exactly Java's socket factory
does, describe in your own words the various communication flexibility you
consider important for distributed applications written with middleware like
Java. However, rather than doing this in general, be specific, by first identifying
application domains, then commenting on the necessary flexibility.
a) Comment on: - multimedia applications (identify at least three
issues)
- scientific applications (identify two different issues)
- publish/subscribe applications (identify at least one
additional issue).
b) Comment on how you would implement your socket factory to be sufficiently
general to address the various application needs identified in a). No need
for design details here, just a discussion of what's easily vs. not easily
implemented at that level of abstraction.
7. All modern computer systems use paged virtual memory with separate
address spaces for every process. Periodically, designers get bored
with this pattern and propose alternatives, e.g. placing all the processes
into a single address space and protecting them from each other with the
per-page protection bits instead of via separate namespaces.
a. List the advantages and disadvantages of per-process address spaces.
b. One major issue in a single-address-space machine is how to locate
and/or relocate a program in the address space. How might you solve
this problem at build time, at load time and at runtime?
c. How would a single address space OS simplify or complicate the require
hardware support for virtual memory? I.e. what, if anything, would
TLBs, caches, I/O systems, have to or be able to do differently?
8. Cluster-based Web Cache System
Your new company, SIM, provides cluster-based software for ISP's that allows
them to do simple firewalling and Web caching. A customer installs
a SIM caching cluster according to their expected user population size.
Under normal conditions (typical Web workloads, no undue network congestion
effects, etc.), a single cluster node can handle about 60 hits per second.
Some of your more sophisticated customers tell you that they would like to
be able to "augment" the caching with additional capabilities. For example,
@mobi.com has many users who use PDA's who access the Web through their ISP
cluster, so in addition to fetching and caching a page, they would like to
be able to perform simple transformations on data, depending on whether the
client is a Windows PDA, Palm PDA, or regular desktop browser (IE/Netscape).
So you agree to design an API that will allow them to insert "plug-ins" into
the cache stream.
Below is the simplified control flow of your existing architecture.
Your software architecture is concurrent: a single thread handles the incoming
request, but after the 'get page from web' procedure returns, the thread
forks into two so that the "store in cache" and "deliver to client" code
paths can be executed in parallel.
Please answer Question A and then choose to answer one of Question’s B, C,
and D. Please write down your answers to the best of your knowledge.
Question A:
Redraw the diagram to indicate where a customer-supplied transformation module
could fit in the control flow. (There may be more than one place where
it would work.)
Question B:
In the original logic diagram above, all of your functional blocks correspond
to methods inside a single process. (The "store in cache" and "deliver
to client" blocks can be executed by different threads, but they are user-level
threads in the same process space.) @Mobi.com wants to know if their plug-in
module will run
(a) in the same process space as a separate procedure call that is part of
,
(b) in the same process space but as a separate (forked) thread,
(c) in a different process space.
Thoroughly enumerate the pros and cons of each approach. Consider performance,
ease of programming, capacity provisioning, and failure management.
Which approach (a, b or c) would you ultimately recommend and why?
Question C:
One day, the recording industry companies have a vision in a dream telling
them to put all of their content online for free. Music fans are immediately
begin downloading MP3 files.
Make and state reasonable quantitative assumptions regarding the relative
sizes of a typical MP3 file compared to other Web workloads, and estimate
the reduction in capacity of your cluster when serving this workload.
(Hint: Assume that under the new workload, about 80% of the bytes transferred
correspond to MP3 files. You'll need to compare this with a defensible
assumption about the old workload, i.e., an assumption based on some recent
data.) Again, clearly state your data sources and path of reasoning.
Question D:
MP3's are a lot bigger than images and Web pages, and soon enough, @mobi.com's
cache disks all fill up. (Their system is still passing requests through,
but the cache is thrashing: every time a new file is brought in from a server,
it causes the eviction of an existing file that will probably be referenced
again soon.) @ mobi.com's administrator is upset, and tells you he
is ready to just turn off the cluster to save electricity. "No!" you
say. "There's still some benefit from having the cache operating, even
if it is thrashing!" Please describe, to the best of your knowledge, what's
the benefit, and who will benefit.
9. Many mechanisms have been proposed for developing Operating Systems
in a modular form. Approaches range from micro-kernels and exo-kernels to
the more pedestrial Linux modules mechanism.
-What is the intended advantage of all such modularization mechanisms?
-What is the advantage of a micro-kernel over something like the Linux
modules mechanism?
-Describe the main challenge of externalizing parts of the kernel
into the application space as in the exo-kernels work.
(Hint: it has to do with safety.) What ideas have been proposed for safely
externalizing parts of the kernel? How does the kernel ensure the safety
of the external extensions and what kinds of safety properties are examined?