CS 3210 Operating System Design

Spring 2003, TuTh 1:35-2:25

 

Exam 1 Study Guide

 

Topics Covered:

·        OS/Kernel Intro

·        Booting/Kernel Init (appendix A)

·        System Calls (ch 9)

·        Memory Addressing (ch 2)

·        Processes (ch 3)

·        Interrupts and Exceptions (ch 4)

·        Kernel Synch (ch 5)

·        Timing (ch 6)

·        Memory Management (ch 7)

 

Exam is closed-book. There will be 10 questions in 90 minutes with short answers (usually 2 or 3 well chosen sentences). I expect knowledge of general concepts, important ideas and some details. Questions are drawn mostly from the primary text. I highlight important info in class but may draw a detail question from something described in the text but mentioned in passing in class

 

I recommend using a top-down study strategy. Read the chapter, close the book and ask yourself, “What were the 3-5 most important ideas or concepts presented in this chapter?” A “brainstorming” approach is also useful. I do this in class as review sometimes. “Tell me everything you know about interrupts and exceptions.” Just list everything you know and then organize among the major topics you listed above.

 

I often try to highlight or emphasize things that are particularly interesting or unusual or not well-known. In addition, I try to relate implementation details in Linux to corresponding general principles in OS (from the Dinosaur book).

 

Another technique is to skim the chapter, looking at headings, subheadings, and so forth. You should probably know something about each section and subsection. As a random example, consider “IRQ distribution in multiprocessor systems” on page 136. Why is this an issue? Because interrupts are executed (usually) by a single processor. So there is a policy issue that arises. Which processor should respond? How is this resolved in Linux? You should at least understand the basic SMP interrupt architecture using APICs. There is a “front-end APIC” and a per-cpu “local APIC” that share a communication bus. At a further level of detail you might recall that Intel hardware has a way to direct interrupts to the “lowest priority” task but that Linux simply uses the “round-robin” arbitration circuitry present in the hardware to handle situations in which two or more tasks have the same priority.

 

OS/Kernel Intro

Booting/Kernel Init

System Calls

          user/kernel boundary

        dual-mode architecture

        mode bit, privileged instructions

        protected procedure call

        library wrapper routines vs sys call

        saving context

        cost of user/kernel transition

        user vs. kernel stack

        blocking system calls: slow vs fast

        trap” instructions: trap, swi, int 80h (Intel)

        sys call interrupt handler vs syscall “service routines” (do_xyz())

        system_call() entry point

        sys_call_table

        man 2 (list of sys calls)

        parameter passing issues

                sys call number

                copying params to/from user space

        “fix-up code” and “fix-up” tables

                lazily detecting invalid syscall (user) parameters

                need to distinguish kernel code exception (oops) from bad param

                lazy checking to keep transition fast

        things that happen on the way back out of system calls

                need_resched? -> schedule()

                signal delivery

                deferred processing (bottom halves, tasklets)

Memory Addressing

          segmentation vs paging

        logical -> linear -> physical

        segments historically

        why Linux minimizes use of segments

        Intel segmentation architecture

                segment registers contain segment selectors

                segment selectors are indices into a segment descriptor table

                segment descriptors contain “base & bounds”, permissions, etc.

                global vs local descriptor tables, gdtr, ldtr

                segment descriptor shadow registers

        Linux segmentation

                just gdt primarily

                9 general segments (overlapping)

                2 per process: task state descriptor, local descriptor table (ldt) descriptor

                per process ldt available for DOS emulation

                general segments: null, kernel code/data, user code/data, APM (4)

        Intel two-level paging

                4k, 4m pages

                        advantages of “jumbo pages”

                cr3 -> top-level page directory

                protection bits

                other bits

                why three-level paging?

                        reduces page table memory usage for large address spaces

                Intel PAE: 36 bit physical addresses

                caching, tlb

        Linux paging: three-level abstraction

                middle-level compiled out by macro tricks on Intel

        kernel layout in memory

                physical and virtual

        kernel mapped into high-virtual for each process

Processes

          basic ideas: process, user/kernel threads

        advantages of threading

        task_struct

                look at the code online; be familiar with a variety of fields

        process state in Linux

        pid vs. task_struct address (pid hash)

        task_struct allocated in 2 page kernel stack

                current optimization

                kernel overflow clobbers task_struct

        lists and list iterators

                task_list (all), run_queue

        process ancestry linkage (parents, siblings, etc.)

        a bit on wait queues and their implementation

        process resource limits

        process (context) switch

                saving context (Intel hardware, task state segment)

                thread context data structure

                switch_to() // very tricky and architecture dependent!

                lazy saving of floating-point registers

        clone, fork, exit

        kernel threads (no user-space mapping, shared page table)

                pid 0 (idle, aka “swapper”)

                pid 1 (init)

                kevent, kapm, kswapd, kflushd, kupdated, ksoftirqd

Interrupts and Exceptions

          getting cpu’s attention

        synch (exceptions, errors) vs. asynch (interrupts, usually devices)

        hardware or software initiated

        top-half + bottom half (deferred, optional)

        some issues:

                maskable vs. non-maskable

                restart?

        general architecture: DEVICE -> PIC -> CPU

                devices interrupt on IRQs

                IRQs translated to vectors (IDT indices)

                can be disabled at CPU or PIC (individual)

                pending interrupts not lost

        APIC

                fancy PIC for interrupt routing

                includes clock, timer, etc.

                implements IPI (interprocessor interrupts)

        exceptions

                about 20, architecture dependent

                usually just translated to signal sent to process

                page fault is different: complex case analysis

        IDT

        different types of “gates”

        nested execution of interrupts and exceptions

        IRQ data structures

                irq_desc – array of irq_desc_t

                irq_desc_t – pointer to PIC “handler”, flags, lock, pointer to list of irqactions

                irqaction – handler (actual service routine), flags, name

        IRQ sharing issues

        bottom halves (old), tasklets (better), softirqs (most general)

                mechanisms for deferred processing

                differ in the amount of concurrency (& complexity of handler) available

                use tasklets generally

                executed on completion of last pending interrupt, transition to user, etc.

        return from interrupts, exceptions, system calls, etc.

                common assembly code with different entry points

                iret instruction

                schedule, signal delivery, tracing, mode transition, stack switch, etc.

Kernel Synchronization

          the problem: race conditions

        the solution: critical sections (mutexes)

        how can concurrency happen in the kernel?

                interrupts, exceptions, blocked system calls, smp

        kernel pre-emption:

                2.4 is non-preemptive (simplifying assumption)

                creates a certain scheduler latency

                2.5 is kernel pre-emptive (more need for explicit synch)

        synchronization hierarchy

                atomic ops (various, test-and-set, read-modify-write)

                memory barriers (r/w)

                disable interrupts (local or global)

                spin locks (r/w)

                semaphores (r/w)

        when you don’t need explicit synch

        atomic ops (i++ not necessarily atomic!)

                Intel lock prefix, etc.

        why memory barriers?

                on smp machines: memory access reordering

                        by compiler

                        by pipeline

                things can appear to occur out of order

                need to enforce ordering for synch

        interrupt disabling

                historically most common

                really need to save/restore (nesting)

                global is bad (deprecated)

                doesn’t protect from SMP access

        spin lock (SMP primitive)

                very short hold times

                usually disable and spin

                per data structure locks (lock granularity)

                r/w

                big reader spinlocks (to reduce cache ping pong)

        semaphores (blocking primitive)

                difference between user and kernel primitives

                implementation: count and queue, uses spinlocks and disables

                complex implementation for efficiency

        [completions: not covered]

        examples of situations requiring synchronization

Timing

          why time is important to os

        Intel hardware: RTC, PIT, TSC, APIC timer

        capabilities of each

        role of each in Linux

                xtime

                jiffies

        sys calls

                gettimeofday

                adjtimex

                setitimer, alarm

        timer interrupt handler (PIT handler)

                top-half

                bottom-half

        things that need doing

                increment jiffies

                update xtime

                check quantum expiration

                check resource limits

                update load averages, stats

                execute expiring kernel timers

                kernel profiling

                NMI watchdog timer (cool idea on SMP systems)

        kernel dynamic timers

                need for lots of high efficiency timers

                simple implementation idea not efficient

                complex data structure

                array of lists of timers expiring in next 256 jiffies

                need to “replenish” on wrap-around

                keep another array of lists of timers (higher granularity)

                actually there are five lists

                cascading replenishes are possible but very rare

Memory Management

          memory allocation overview

        how malloc works

        why kernel allocation is critical

        everything but kernel code & static data comes from allocator

                process pages, descriptors, etc.

        page allocator (buddy system)

        on top is slab allocator (descriptor specific)

        on top is kmalloc (general purpose)

        vmalloc uses page allocator + slab

                physically discontiguous allocations (uses page table tricks)

        zones

        nodes

        buddy system

                power of 2 allocator”

                10 lists

                size not available? look at next largest, split if necessary

                merge “buddies” on free when possible

                buddies: same size, contiguous, properly aligned

                        consider 256k chunks in 1m

                        chunks at 0 and 256 are buddies

                        chunks at 256 and 512 are not

                data structure

                aggressive vs. lazy merging