Course Info

The course is a combination of lectures and programming assignments in which we will study the internals of modern database management systems. It will cover the core concepts and fundamentals of the components that are used in both high-performance transaction processing systems (OLTP) and large-scale analytical systems (OLAP). The class will stress both efficiency and correctness of the implementation of these ideas. The course is appropriate for advanced undergraduate and graduate students interested in systems programming.

  • Instructor: Joy Arulraj
  • Time: Mon/Wed 3:30 – 4:45 PM
  • Location: Instructional Center 211
  • Discussion platform: Piazza
  • Grading platform: Gradescope
  • TAs: Gaurav Tarlok Kakkar

Syllabus

Format

The course is a combination of lectures and programming assignments in which we will study the internals of modern database management systems.

Prerequisites

Students are expected to have completed the following undergraduate-level computer systems courses:

  • Data Structures and Algorithms (CS 1332) (strict)
  • Computer Systems and Networks (CS 2200) (strict)
  • Design of Operating Systems (CS 3210) (recommended)
  • Introduction to Database Systems (CS 4400) (recommended)
  • Database System Implementation -- Part I (CS 4420/6422) (recommended)

and to be comfortable with programming in Java or C++. The course is open to both graduate and advanced undergraduate students.

Academic Honesty

All students should adhere to the Georgia Tech Honor Code. University policies will be followed strictly in this course. Please pay particular attention to the policies on academic misconduct.

Educational Objectives

This is the second part of a two-part series of courses on the design and implementation of database management systems. This course has a heavy emphasis on programming assignments. There will be two exams. Upon successful completion of this course, the student should be able to:

  • Understand and apply state-of-the-art implementation techniques for database management systems following modern coding practices.
  • Identify trade-offs among database systems techniques and contrast alternatives for both on-line transaction processing and on-line analytical workloads.
  • Develop and justify design decisions in the context of a high-performance database system.
  • Implement and evaluate complex, scalable components of database systems, with emphasis on providing experimental evidence for design decisions.

This course will be mostly self-contained. We will cover the following topics in the second part of the series:

  • Query Compilation, Vectorization
  • Concurrency Control
  • Logging and Recovery Methods
  • Query Optimization
  • Leveraging Modern Hardware

Grading Scheme

The final grade for the course is tentatively based on the following weights:

  • 45% Programming Assignments
  • 15% Exercise Sheets
  • 20% Mid-term Exam
  • 20% Project

Programming Assignments

The programming assignments are geared towards exploring the topics covered in the lectures. We will be using an end-to-end toy relational database management system for this course. This system has been developed for educational purposes and should not be used in production.

Exercise Sheets

The exercise sheets consist of a set of subjective problems on the topics covered in the lectures. They are representative of the questions that will appear in the exams.

Exams

There will be two remote exams as specified in the schedule.

The exams will be a combination of written and multiple-choice questions based on the topics discussed in class. They will be open notes.

Please advise me of any conflicts with these likely exam dates before the end of the second week of classes.

Textbook

Credits

This website is based on a design by Andy Pavlo.