CS 3210 Operating System Design
Spring 2003, TuTh
Final Exam Study Guide
Topics Covered:
·
Accessing Files (ch 15) – 2 questions
·
Swapping (ch
16) – 2 questions
·
Ext2/Ext3 (ch
17) – 2 question
·
Networking (ch
18) – 2 questions
·
Process Communication (ch 19) – 2 questions
·
Program Execution (ch 20) – 2 question
Exam is closed-book. There will be 12 questions in 2
hours and 50 minutes with short answers (usually 2 or 3 well chosen sentences).
I expect knowledge of general concepts, important ideas and some details.
Questions are drawn mostly from the primary text. I highlight important info in
class but may draw a detail question from something described in the text but
mentioned in passing in class
I recommend using a top-down study strategy. Read
the chapter, close the book and ask yourself, “What were the 3-5 most important
ideas or concepts presented in this chapter?” A “brainstorming” approach is
also useful. I do this in class as review sometimes. “Tell me everything you
know about interrupts and exceptions.” Just list everything you know and then
organize among the major topics you listed above.
I often try to highlight or emphasize things that
are particularly interesting or unusual or not well-known. In addition, I try
to relate implementation details in Linux to corresponding general principles
in OS (from the Dinosaur book).
Another technique is to skim the chapter, looking
at headings, subheadings, and so forth. You should probably know something
about each section and subsection. As a random example, consider “IRQ
distribution in multiprocessor systems” on page 136. Why is this
an issue? Because interrupts are executed (usually) by
a single processor. So there is a policy issue that arises. Which
processor should respond? How is this resolved in Linux? You should at least
understand the basic SMP interrupt architecture using APICs. There is a
“front-end APIC” and a per-cpu “local APIC” that share a communication bus. At
a further level of detail you might recall that Intel hardware has a way to
direct interrupts to the “lowest priority” task but that Linux simply uses the
“round-robin” arbitration circuitry present in the hardware to handle
situations in which two or more tasks have the same priority.
Accessing Files
VFS
interaction: file-system-specific read(), write()
methods
page-based, page-cache
generic_file_read()
writing more complex->
file size might change, need to release blocks
generic_file_write()
reading
do_generic_file_read()
get address space, inode
figure out page
read pages (loop)
check page cache
readpage()
– blocks waiting for data
ext2_readpage() – wrapper for generic block reads (of page)
parameterized by function for page to block mapping (inode-block map)
blkdev_readpage()
– device file
page to block mapping trivial
copy to user space
update access times, etc.
read-ahead
why, advantages
heuristic to turn on/off
knob to control amount of read-ahead (window)
“sequential” – in the window
read-ahead group
subset of window
last set of pages requested via read-ahead
case analysis for where read falls (e.g. in window, outside
group, etc.)
/proc
filesystem control on parameters
writing
basically: find blocks, copy data into page cache, mark
dirty
acquire inode semaphore for
atomicity
check permissions (mode), resource limits, update times
copy to page cache
wait for flush if O_SYNC
prepare_write
block resolution (deal with holes)
*
read block into page cache if partial and not present!
commit_write
mark dirty
memory mapping
be familiar with data structure diagram (Fig 15-4)
understand private mappings and how they use COW
understand how mmap() manipulates VMAs
direct (raw) i/o
specify by opening file with O_DIRECT
bypass block-device subsystem (caching, read-ahead, etc.)
still need kernel to setup DMA, handle interrupts
direct device-to-user transfer
pages must be locked during transfer
kiobuf: direct access buffer
track pages and buffers
transfers in chunks up to 512KB
must avoid concurrent raw and normal access
Swapping
what is it? why? advantages
virtualizes memory, expands address space
swapping vs. disk cache
lots of stuff, not just page replacement
swap area management
several questions
what is swappable?
swapping versus paging out (to mapped file)
paging out versus releasing (clean) file mapped page
when to swap?
which process? which page?
where to put page (which swap area? which slot?)
what’s the “victim” selection algorithm?
simple LRU approximation -> second chance, clock hand
algorithm
swap areas
disk, partition, file
priorities by speed
slots
swap area descriptor, data structures
where is swapped out location stored? pte
activating (easy), deactivating (hard – may fail)
slot allocation algorithm
clusters, scan algorithm
swap area looks like a file mapping (has its own address
space object)
swap cache
staging area for swapped out/in pages
swap selects pages, transfer controlled by separate mechanism
cache allows tracking of shared pages
page swapped out individually from address space
removed from memory only when last address space swaps
data structure limitation->2.6 reverse page mapping
swap out
scan address spaces for eligible pages
looks for goal set (32 pages)
keep scanning until then, resume at same place next time
try_to_swap_out()
accessed bit – if set, page is “young”, not swapped till
next pass
add to swap cache (prepare to move out)
flush tlb entry
allocate slot
does not activate i/o
swap in
check swap cache
performs read-ahead (8 pages)
blocking
update pte, rss
reclaiming page frames
memory pressure is really bad
lots of possibilities
free up unused cache structures, ro
cache data
write
dirty cache data
swap
kill process! oom_killer()
activated by failure of a memory request (page allocator)
try_to_free_pages(#of pages, priority)
makes several passes with increasing urgency
utilizes two LRU lists of pages (active, inactive)
periodically balanced; about 2/3 active
only scans active
uses 2 bits to maintain state (PG_active,
PG_referenced)
a little state machine Fig 16-4
software speeds (not like MMU)
ops like reading, loading, swapping update bits
shrink_caches()
kswapd – periodic swap out
why? some memory allocations can’t
block (interrupt handlers)
can’t wait for swap out
if only interrupt i/o (unlikely) system could freeze
Ext2/Ext3
good example of standard, solid fs
characteristics,
features
selectable
block size
inode count at creation time
block
“groups” to enhance locality
preallocates block chunks (8)
careful
write ordering
fancy fsck
immutable,
append-only files (at mount)
planned
enhancements
block fragmentation
acl
encryption,
compression
logical
delete
journaling
data structures
partition
layout: bootblock, n block groups
block
group: super block, block group descriptors, data, inode
bitmaps, inodes, data
understand
how sizes are calculated
on-disk
versus in-memory data structures
“pointers”
on disk (block numbers)
large files?
two
limits: # of files, file size
i_size is 32 bits (4 GB limit)
hack to
increase size on 64 bit architectures: steal acl
field
no inode field in on-disk inode;
why?
7 kernel known filetypes:
regular, dir, socket, fifo,
sym link, dev files
understand
how data pages are used for each
directory data
layout: variable length records, to delete zero inode,
skip over
bitmap caches:
too big to fit all in memory, lru cache
creating a
filesystem (mke2fs)
disk block
allocation
try to
keep data files together (contiguous), in the same group as parent dir
try to
spread subdirs across block groups
fragmentation?
not a
big problem, keep a block reserve, cluster
frag utilities available but not very necessary
file holes
block preallocation
ext3 and
journaling
filesystem
consistency: metadata (mostly), sometimes (data)
fsck became a problem (time) with big disks
walks
entire fs, verifies ALL metadata
places
lost block in /lost+found
journaling
uses db transaction technique
log
what you will do before doing it
log ops or data?
metadata ops or actual file data?
log
on separate high-speed disk
in
theory doubles i/o but
log is append only, highly clustered
even subsequent i/o is “more organized”
not so slow
on
failure, redo logged but incomplete transactions
discard
partial log entry (last one only)
ext3 –
simple “add-on” to ext2
creates
a log in a special ext2 file
allows
three levels of logging (increasing guarantees, costs)
writeback – only metadata
ordered – write data to disk before metadata (default mode)
journal – full data and metadata
uses
journaling block device subsystem
can
be used by other journaling mechanisms
complex;
must protect itself from crashes while providing atomicity guarantees
log
records, atomic operations, transactions
journal_start(), journal_stop()
Networking
IPC
Execution