CS 4420/8803DSI Database Systems Implementation, Spring 2008

 

Prof: Leo Mark (leomark@cc.gatech.edu)

Office: KLAUS 3324
Lectures: MWF 11am – 12pm, KLAUS 1447
Office Hours: MWF 12 – 1pm.

TA:
Minh Quoc Nguyen (quocminh@cc.gatech.edu)
Office: KLAUS 3319
Office Hours: Mon, Tues 12pm – 1:30pm

Raghavendra (raghavendra.tk@gatech.edu)
Office: KLAUS 3319
Office Hours: Thu, Fri 12pm – 1:30pm

TEXT: Database System Implementation

Garcia-Molina, Ullman & Widom, Prentice Hall, 2000

PAPERS: Selected papers

NEWSGROUP: git.cc.class.cs4420

SWIKI: http://swiki.cc.gatech.edu/cs4420

This forum will be used to post the class notes as well as any resources you find useful for the class. Feel free to use it to share knowledge with the entire class.

ANNOUNCEMENTS:

Check this section of the webpage often for updates and information. Latest update: [04/14/08]

[1/29/08] The detailed requirement specifications for Phase I have been uploaded.

[1/29/08] BTree Source code (JAVA). Phase I note.

[1/29/08] Project Description is uploaded.

[2/01/08] TA Raghavendra's office has changed to KLAUS 3319.

[2/08/08] Presentation for Phase I Requirements.

[2/08/08] B+-tree code (C).

[2/08/08] Skeleton for B+-tree code (read this before using the B+-tree code): Java, C

[2/09/08] More information about Phase I will be posted by Monday, Feb 11th. Please check frequently. Check Swiki for more detail.

[2/10/08] Storage Manager guideline. A data file example.

[2/19/08] The deadline for Phase I will be extended to Feb 29th.

[2/19/08] The exam will be postponed until March 7th.

[03/02/08] Archival Metadata

[03/02/08] Demo Appointment

[03/02/08] Minh Quoc Nguyen will hold office hours on Friday, March 7th instead of Monday, March 3rd.

[03/03/08] A script file (note) for Phase I demo

[03/23/08] Phase II implementation guideline (v.2)

[04/07/08] Quiz II will be on April 18th.

[04/07/08] Phase II Demo: Phase II Demo Guideline. Demo time: check Swiki for more detail.

[04/07/08] Phase III: Implementation Guideline.

[04/14/08] TA Minh will hold TA office hours on Friday (4/18) instead of Monday (4/14).

COURSE CONTENT

In this course we will study four major topics relating to database system implementation. The emphasis is on the "systems" components of a database management system. To better understand these components, a database implementation project will be required, in which you will build some of the basic "systems" components of a simple database management system. We start with a brief review of relational database concepts and an overview of the basic components of a database system.

The first major area of study deals with storage management. How data is stored (organized) on secondary storage plays an important role in processing database queries efficiently. We will examine the various file structure alternatives involving indexing and hashing.

The second area deals with the query processing component of a relational database system. Here, we are interested in two topics: transformations that are applied to a user query to make it execute more efficiently, and algorithms that implement the various relational algebra operators efficiently. Both of these topics fall within the realm of the query optimizer.

The third topic involves concurrency control. For instance, how can multiple transactions execute on a database and still see a consistent view of the data, while also leaving the database in a consistent state? We examine several concurrency control schemes and their tradeoffs.

The fourth area deals with the recovery manager of a database system. The main concern is how the database system recovers from a failure, e.g., a transaction failure or a system crash. We examine the advantages and disadvantages of several recovery schemes.

If time permits, we will discuss the various issues in database performance tuning and how parallel relational database systems can be used to improve the performance of query and transaction processing.

PROJECT DESCRIPTION: Uploaded

Topic                                  Chapter

Introduction to DBMS Implementation    1
Relational DB Review                   2
Data Storage                           3
Representing Data Elements             4
Index Structures                       5
Query Execution                        6
Query Compiler                         7
Coping with System Failures            8
Concurrency Control                    9
More about Transaction Management      10
Archival                               Selected papers
Metadata Management                    Selected papers
Temporal Databases                     Selected papers
Incremental Computation                Selected papers

GRADING

UGRADS: 2 Exams – 20 points each, Final – 20 points, Project – 40 points (total 100)

GRADS: 2 Exams – 20 points each, Final – 20 points, Project – 60 points (total 120)

IMPORTANT DATES:

QUIZ       DATE

Quiz I     Feb 29 (postponed until Mar 7th)
Quiz II    April 11 (postponed until Apr 18th)

PROJECT SCHEDULE

PHASE I      Feb 22 (extended to Feb 29th)
PHASE II     April 4
PHASE III    April 23
DEMO         April 24-25