CS 6210: Advanced Operating Systems
Spring 2009
CS 6210 (Advanced Operating Systems) is a graduate-level course that covers in detail many advanced topics in operating system design and implementation. It starts with topics such as operating system structuring, multithreading, and synchronization, and then moves on to systems issues in parallel and distributed computing. There is no textbook for this course. Instead, we will read and discuss a number of important published research papers. For each paper covered in class, students are expected to gain a solid understanding of the problem the paper addresses and the solution the authors propose. While this syllabus lists only the papers covered in class, additional materials, including background readings and optional readings, can be found here.
10% class participation
35% projects
25% midterm
30% final
Note that a passing grade is required in each of the above components in order to pass the class.
This course is project-intensive and will have a sequence of four or five projects. Strong programming skills are absolutely essential for completing these projects. Students can either do the entire sequence of assigned projects or define a project that fits more closely with their individual research goals as a replacement for the projects after the first one. The first project is to be completed individually by everyone.
Each student in the class will have access to Clusters and IHPCW machines. Additional information regarding project resources will be provided on the class mailing list and with individual project assignments. If you do not have a COC account, please apply for one as soon as possible.
Optional supplementary reference texts include the following:
Note: the following paper list is tentative and subject to change in the next week or so. Please keep this in mind before printing a large number of papers, especially those from later in the schedule. A list including all necessary background readings and optional readings can be found here.
1. Dawson R. Engler, Frans Kaashoek and James O'Toole, "Exokernel: An Operating System Architecture for Application-Level Resource Management", Proceedings of the 15th ACM Symposium on Operating System Principles, ACM, December 1995. (slides)
2. Brian Bershad et al., "Extensibility, Safety and Performance in the SPIN Operating System", Proceedings of the 15th ACM Symposium on Operating System Principles, December 1995. (slides)
3. J. Liedtke, "On Micro-Kernel Construction", Proceedings of the 15th ACM Symposium on Operating System Principles, ACM, December 1995. (slides)
4. Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, Andrew Warfield, "Xen and the Art of Virtualization", SOSP 2003. (slides, slides on Xen 3.0)
1. Mellor-Crummey, J. M. and Scott, M., "Algorithms for Scalable Synchronization on Shared-Memory Multiprocessors", ACM Transactions on Computer Systems, Feb. 1991. (slides)
2. Partial reading: Paul E. McKenney and John D. Slingwine, "Read-Copy Update: Using Execution History to Solve Concurrency Problems", Parallel and Distributed Computing and Systems, Oct. 1998. (useful slides)
3. Ben Gamsa, Orran Krieger, Jonathan Appavoo, and Michael Stumm, "Tornado: Maximizing Locality and Concurrency in a Shared Memory Multiprocessor Operating System", 1999 Symposium on Operating System Design and Implementation. (slides)
4. Kinshuk Govil, Dan Teodosiu, Yongqiang Huang, and Mendel Rosenblum, "Cellular Disco: Resource Management Using Virtual Clusters on Shared-Memory Multiprocessors", Proceedings of the 17th Symposium on Operating Systems Principles, 1999. (slides)
5. Alexandra Fedorova, Margo Seltzer, Christopher Small and Daniel Nussbaum, "Performance of Multithreaded Chip Multiprocessors and Implications for Operating System Design", USENIX 2005.
6. Partial reading: M.S. Squillante and E.D. Lazowska, "Using Processor-Cache Affinity Information in Shared Memory Multiprocessor Scheduling", IEEE Transactions on Parallel and Distributed Systems, Feb. 1993, pgs. 131-143.
7. Ripal Nathuji and Karsten Schwan, "Virtual Power: Coordinated Power Management in Virtualized Enterprise Systems", Symposium on Operating Systems Principles (SOSP), Oct. 2007.
Partial reading: Schroeder, M., and Burrows, M., "Performance of the Firefly RPC", Proceedings of the Twelfth ACM Symposium on Operating Systems Principles, pgs. 83-90, December 1989.
1. B. N. Bershad, T. E. Anderson, E. D. Lazowska, and H. M. Levy, "Lightweight Remote Procedure Call", ACM Transactions on Computer Systems, 8(1):37-55, Feb. 1990. (slides)
2. C.A. Thekkath and H.M. Levy, "Limits to Low-Latency Communications on High-Speed Networks", ACM Transactions on Computer Systems, May 1993. (slides)
3. Hutchinson, N.C., Peterson, L.L., "The x-Kernel: An Architecture for Implementing Network Protocols", IEEE Transactions on Software Engineering, 17, 1, pgs. 64-76, January 1991. (slides)
4. David Wetherall, "Active Networks: Vision and Reality: Lessons from a Capsule-based System", 17th ACM Symposium on Operating System Principles, OS Review, Volume 33, Number 5, Dec. 1999. (slides)
5. Liu, Kreitz, van Renesse, Hickey, Hayden, Birman, Constable, "Building Reliable High Performance Communication Systems from Components", 17th ACM Symposium on Operating System Principles, OS Review, Volume 33, Number 5, Dec. 1999. (slides)
Date: TBD, In Class
Here are some example midterms:
1. Lamport, L., "Time, Clocks, and the Ordering of Events in a Distributed System", Communications of the ACM, 21, 7, pgs. 558-565, July 1978. (slides)
1. Mahadev Satyanarayanan, "Coda: A Highly Available File System for a Distributed Workstation Environment", IEEE Trans. Computers, vol 39, no 4, Apr 1990 (slides)
2. Anderson, T. et al., "Serverless Network File System", ACM Transactions on Computer Systems, February 1996. (slides)
3. Feeley, Morgan, Pighin, Karlin, Levy, Thekkath, "Implementing Global Memory Management in a Workstation Cluster", Fifteenth ACM Symposium on Operating System Principles, Dec. 1995. (slides)
4. C. Amza, A. Cox, S. Dwarkadas, P. Keleher, H. Lu, R. Rajamony, W. Yu and W. Zwaenepoel, "TreadMarks: Shared Memory Computing on Networks of Workstations", IEEE Computer, February 1996. (slides)
1. R. Haskin et al., "Recovery Management in QuickSilver", ACM Transactions on Computer Systems, February 1988. (slides)
2. Satyanarayanan, M., et al., "Lightweight Recoverable Virtual Memory", Proceedings of the Fourteenth ACM Symposium on Operating System Principles, pgs. 146-160, December 1993. (slides)
3. J. N. Gray, P. McJones, M. W. Blasgen, R. A. Lorie, T. G. Price, G. R. Putzolu, and I. L. Traiger, "The Recovery Manager of a Data Management System", ACM Computing Surveys, Vol. 13, No. 2, June 1981, pp. 223-242. (slides)
1. Nov. 24 Cohen, E., and Jefferson, D., "Protection in the HYDRA Operating System", Proceedings of Fifth ACM Symposium on Operating System Principles, pgs. 141-160, 1975. (slides)
2. Mitchell, J. G., et al., "An Overview of the Spring System", Proceedings of Compcon, Feb. 1994.
3. Hamilton, G., Powell, M.L., and Mitchell, J.G., "Subcontract: A Flexible Base for Distributed Programming", Proceedings of the Fourteenth ACM SOSP, pgs. 69-79, December 1993. (slides)
4. Wollrath, A., Riggs, R., and Waldo, J., "A Distributed Object Model for the Java System", Usenix Conference on Object Oriented Technologies and Systems, May 1996. (slides)
5. Emmanuel Cecchet, Julie Marguerite, Willy Zwaenepoel, "Performance and Scalability of EJB Applications", Proceedings of the 17th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications. (slides)
1. Helen J. Wang, Xiaofeng Fan, Jon Howell, Collin Jackson, "Protection and Communication Abstractions for Web Browsers in MashupOS", ACM Symposium on Operating System Principles, 2007.
2. Luiz André Barroso, Jeffrey Dean, Urs Hölzle, "Web Search for a Planet: The Google Cluster Architecture", IEEE Micro.
3. Eric A. Brewer, "Lessons from Giant-Scale Services", IEEE Internet Computing.
4. Armando Fox, Steven Gribble, Yatin Chawathe, Eric Brewer, and Paul Gauthier, "Cluster-based Scalable Network Services", Sixteenth ACM Symposium on Operating System Principles, Oct. 1997.
5. Saito, Bershad, Levy, "Manageability, Availability, and Performance in Porcupine: A Highly Scalable Cluster-based Mail Service", 17th ACM Symposium on Operating System Principles, OS Review, Volume 33, Number 5, Dec. 1999. (slides)
6. Shahabi, Zimmermann, Fu, and Yao, "Yima: A Second-Generation Continuous Media Server", IEEE Computer Magazine, June 2002.
7. Dec. 1 Ashvin Goel, Luca Abeni, Charles Krasic, Jim Snow, Jonathan Walpole, "Supporting Time-Sensitive Applications on a Commodity OS", OSDI 2002. (slides)
There will be a lab session. Date to be announced.
Date: TBD, In Class
Example finals: