CS 3210 Operating System Design

Spring 2003, TuTh 1:35-2:25

 

Final Exam Study Guide

 

Topics Covered:

·        Accessing Files (ch 15) – 2 questions

·        Swapping (ch 16) – 2 questions

·        Ext2/Ext3 (ch 17) – 2 question

·        Networking (ch 18) – 2 questions

·        Process Communication (ch 19) – 2 questions

·        Program Execution (ch 20) – 2 question

 

Exam is closed-book. There will be 12 questions in 2 hours and 50 minutes with short answers (usually 2 or 3 well chosen sentences). I expect knowledge of general concepts, important ideas and some details. Questions are drawn mostly from the primary text. I highlight important info in class but may draw a detail question from something described in the text but mentioned in passing in class

 

I recommend using a top-down study strategy. Read the chapter, close the book and ask yourself, “What were the 3-5 most important ideas or concepts presented in this chapter?” A “brainstorming” approach is also useful. I do this in class as review sometimes. “Tell me everything you know about interrupts and exceptions.” Just list everything you know and then organize among the major topics you listed above.

 

I often try to highlight or emphasize things that are particularly interesting or unusual or not well-known. In addition, I try to relate implementation details in Linux to corresponding general principles in OS (from the Dinosaur book).

 

Another technique is to skim the chapter, looking at headings, subheadings, and so forth. You should probably know something about each section and subsection. As a random example, consider “IRQ distribution in multiprocessor systems” on page 136. Why is this an issue? Because interrupts are executed (usually) by a single processor. So there is a policy issue that arises. Which processor should respond? How is this resolved in Linux? You should at least understand the basic SMP interrupt architecture using APICs. There is a “front-end APIC” and a per-cpu “local APIC” that share a communication bus. At a further level of detail you might recall that Intel hardware has a way to direct interrupts to the “lowest priority” task but that Linux simply uses the “round-robin” arbitration circuitry present in the hardware to handle situations in which two or more tasks have the same priority.

 

Accessing Files

          VFS interaction: file-system-specific read(), write() methods

        page-based, page-cache

                generic_file_read()

        writing more complex->  file size might change, need to release blocks

                generic_file_write()

        reading

                do_generic_file_read()

                        get address space, inode

                        figure out page

                        read pages (loop)

                        check page cache

                        readpage() – blocks waiting for data

                                ext2_readpage() – wrapper for generic block reads (of page)

                                        parameterized by function for page to block mapping (inode-block map)

                                blkdev_readpage() – device file

                                        page to block mapping trivial               

                        copy to user space

                        update access times, etc.

        read-ahead

                why, advantages

                heuristic to turn on/off

                knob to control amount of read-ahead (window)

                sequential” – in the window

                read-ahead group

                        subset of window       

                        last set of pages requested via read-ahead

                case analysis for where read falls (e.g. in window, outside group, etc.)

                /proc filesystem control on parameters

        writing

                basically: find blocks, copy data into page cache, mark dirty

                        acquire inode semaphore for atomicity

                        check permissions (mode), resource limits, update times

                        copy to page cache

                        wait for flush if O_SYNC

                        prepare_write

                                block resolution (deal with holes)

                                * read block into page cache if partial and not present!

                        commit_write

                                mark dirty

        memory mapping

                be familiar with data structure diagram (Fig 15-4)

                understand private mappings and how they use COW

                understand how mmap() manipulates VMAs

        direct (raw) i/o

                specify by opening file with O_DIRECT

                bypass block-device subsystem (caching, read-ahead, etc.)

                still need kernel to setup DMA, handle interrupts

                direct device-to-user transfer

                        pages must be locked during transfer

                kiobuf: direct access buffer

                        track pages and buffers

                        transfers in chunks up to 512KB

                must avoid concurrent raw and normal access

               

 

Swapping

          what is it? why? advantages

                virtualizes memory, expands address space

        swapping vs. disk cache

        lots of stuff, not just page replacement

                swap area management

        several questions

                what is swappable?

                swapping versus paging out (to mapped file)

                paging out versus releasing (clean) file mapped page

                when to swap?

                which process? which page?

                where to put page (which swap area? which slot?)

        what’s the “victim” selection algorithm?

                simple LRU approximation -> second chance, clock hand algorithm

        swap areas

                disk, partition, file

                priorities by speed

                slots

                swap area descriptor, data structures

                where is swapped out location stored? pte

                activating (easy), deactivating (hard – may fail)

                slot allocation algorithm

                        clusters, scan algorithm

                swap area looks like a file mapping (has its own address space object)

        swap cache

                staging area for swapped out/in pages

                swap selects pages, transfer controlled by separate mechanism

                cache allows tracking of shared pages

                page swapped out individually from address space

                        removed from memory only when last address space swaps

                        data structure limitation->2.6 reverse page mapping

        swap out

                scan address spaces for eligible pages

                looks for goal set (32 pages)

                        keep scanning until then, resume at same place next time

                try_to_swap_out()

                        accessed bit – if set, page is “young”, not swapped till next pass

                        add to swap cache (prepare to move out)

                        flush tlb entry

                        allocate slot

                does not activate i/o

        swap in

                check swap cache

                performs read-ahead (8 pages)

                blocking

                update pte, rss

        reclaiming page frames

                memory pressure is really bad

                lots of possibilities

                        free up unused cache structures, ro cache data

                        write dirty cache data

                        swap

                        kill process! oom_killer()

                activated by failure of a memory request (page allocator)

                try_to_free_pages(#of pages, priority)

                        makes several passes with increasing urgency

                        utilizes two LRU lists of pages (active, inactive)

                                periodically balanced; about 2/3 active

                        only scans active

                        uses 2 bits to maintain state (PG_active, PG_referenced)

                                a little state machine Fig 16-4

                        software speeds (not like MMU)

                                ops like reading, loading, swapping update bits

                        shrink_caches()

                kswapd – periodic swap out

                        why? some memory allocations can’t block (interrupt handlers)

                        can’t wait for swap out

                        if only interrupt i/o (unlikely) system could freeze

       

               

       

Ext2/Ext3

          good example of standard, solid fs

        characteristics, features

                selectable block size

                inode count at creation time

                block “groups” to enhance locality

                preallocates block chunks (8)

                careful write ordering

                fancy fsck

                immutable, append-only files (at mount)

        planned enhancements

                block fragmentation

                acl

                encryption, compression

                logical delete

                journaling

        data structures

                partition layout: bootblock, n block groups

                block group: super block, block group descriptors, data, inode bitmaps, inodes, data

                understand how sizes are calculated

                on-disk versus in-memory data structures

                pointers” on disk (block numbers)

        large files?

                two limits: # of files, file size

                i_size is 32 bits (4 GB limit)

                hack to increase size on 64 bit architectures: steal acl field

        no inode field in on-disk inode; why?

        7 kernel known filetypes: regular, dir, socket, fifo, sym link, dev files

                understand how data pages are used for each

        directory data layout: variable length records, to delete zero inode, skip over

        bitmap caches: too big to fit all in memory, lru cache

        creating a filesystem (mke2fs)

        disk block allocation

                try to keep data files together (contiguous), in the same group as parent dir

                try to spread subdirs across block groups

        fragmentation?

                not a big problem, keep a block reserve, cluster

                frag utilities available but not very necessary

        file holes

        block preallocation

        ext3 and journaling

                filesystem consistency: metadata (mostly), sometimes (data)

                fsck became a problem (time) with big disks

                        walks entire fs, verifies ALL metadata

                        places lost block in /lost+found

                journaling uses db transaction technique

                        log what you will do before doing it

                                log ops or data?

                                metadata ops or actual file data?

                        log on separate high-speed disk

                        in theory doubles i/o but

                                log is append only, highly clustered

                                even subsequent i/o is “more organized”

                                not so slow

                        on failure, redo logged but incomplete transactions

                        discard partial log entry (last one only)

                ext3 – simple “add-on” to ext2

                        creates a log in a special ext2 file

                        allows three levels of logging (increasing guarantees, costs)

                                writeback – only metadata

                                ordered – write data to disk before metadata (default mode)

                                journal – full data and metadata

                uses journaling block device subsystem

                        can be used by other journaling mechanisms

                        complex; must protect itself from crashes while providing atomicity guarantees

                        log records, atomic operations, transactions

                        journal_start(), journal_stop()

       

         

Networking

       

         

IPC

         

Execution