CS4420 Database System Implementation (Spring 2005)




General Information

Instructor: Professor Ling Liu
Office: CCB 216, Phone: 5-1139, Email: lingliu@cc.gatech.edu
Lecture location: Instruction Center 117
Lecture hours: Tuesday and Thursday 1:35pm -- 2:55pm (Jan 10 ~ May 07, 2005)
Office hours: Tuesdays and Thursdays 12:30pm - 1:30pm, or by appointment

 

Course TAs:  

Weiling Zhu (wzhu@cc.gatech.edu, CCB 226D), Office hours: Tuesday 3-5pm, or by appointment

James Caverlee (caverlee@cc.gatech.edu, CCB 225B), Office hours: Monday 1-3pm, or by appointment

 

Prerequisite(s): CS 4400

 

Newsgroup git.cc.class.cs4420
The newsgroup will be used to post class announcements, answer common questions, and make corrections to project phase reports (if needed). Students are encouraged to use it to discuss class material.

Course Evaluation: available online.


Textbook

Hector Garcia-Molina, Jeffrey D. Ullman, and Jennifer Widom, Database System Implementation, Prentice Hall, 2000.


Administrative Issues

Grading Policy:

Component | Weight | Date
Midterm 1 | 20% | 6th Week: Feb 17 (Thursday)
Midterm 2 | 20% | 13th Week: March 31 (Thursday)
Project | 40% | 16th Week: April 29 (Friday)
Final | 20% | May 2 (Monday), 11:30am-2:20pm

 

For each of the above, there will be a deadline for appealing your grade. No appeals will be considered after this deadline unless there is a medical or other excuse for absence during the appeal period.

Re-examination: No re-examination except as per regulations.

On-line Information: Course-related information, including the assignments, will be maintained on-line on the Web page for CS 4420. The URL for this page is: http://www.cc.gatech.edu/classes/AY2005/cs4420_spring/ . We recommend that you check this page regularly. The on-line availability of the notes will be announced in class; no paper handouts will be distributed.

Policy on Collaboration:

  • You are allowed to discuss the project with each other. Give credit to others for their ideas, but DO NOT merely copy the design ideas or the code of others.
  • The midterms and the final exam are closed-book exams.

Course Description:

In this course we will study four major topics relating to database system implementation. The emphasis is on the "systems" components of a database management system. To better understand these components, a database implementation project is required, in which you will build some of the basic "system" components of a simple database management system. We start with a brief review of relational database concepts and an overview of the basic components of a database system.

The first major area of study deals with storage management. How data is stored (organized) on secondary storage plays an important role in processing database queries efficiently. We will examine the various file structure alternatives involving indexing and hashing.

The second area deals with the query processing component of a relational database system. Here, we are interested in two topics: transformations that are applied to a user query to make it execute more efficiently, and algorithms that implement the various relational algebra operators efficiently. Both of these topics fall within the realm of the query optimizer.

The third topic involves concurrency control: for instance, how can multiple transactions execute on a database at the same time, each seeing a consistent view of the data, and still leave the database in a consistent state? We examine several concurrency control schemes and their tradeoffs.

The fourth area deals with the recovery manager of a database system. The main concern is how the database system recovers from a failure, e.g., a transaction failure or a system crash. We examine the advantages and disadvantages of several recovery schemes.

If time permits, we will discuss issues in database performance tuning and how parallel relational database systems can be used to improve the performance of query and transaction processing.

 

Topic | Chapter
Introduction to DBMS Implementation | 1
Relational DB Review | 2
Data Storage | 3
Representing Data Elements | 4
Index Structures | 5
Query Execution | 6
Query Compiler | 7
Coping with System Failures | 8
Concurrency Control | 9
Transaction Management | 10
Information Integration | 11

Project (demo schedule | demo requirements)

This is a group project. Groups of 3-4 students are required. You will choose one of the two proposed projects listed below. You need to send an email no later than Feb. 3 (Thursday) to the instructor (lingliu@cc.gatech.edu) and TAs (caverlee@cc.gatech.edu, wzhu@cc.gatech.edu), including the following information about your group: 

  • project number 
  • the list of your group members (name, email)
  • the group contact person and his/her email

 

1. Project Description

Proposed Project Option 1: HTML or PDF

* Download B+ tree code (in C) and sample input for testing

Proposed Project Option 2: HTML or PDF

* Get related software

2. Project Resources:

 

C Tutorial 1 and C Tutorial 2

 

C++ Tutorial

 

Java Tutorials

 

3. Project Schedule

 

Phase | Begin | End | Report Due Date
Phase I | 1/20 (week 2) | 3/1 Tuesday (week 8) | 3/3 (Thursday): Phase I report due by midnight (email to TAs)
Phase II | 3/1 | 4/28 (week 16) |
Project Demo | 4/28 10am | 4/29 5pm | Location: TBA, half an hour per group; 4/29: project report and code due

 

4. Project Groups

 

5. Grading

 

Phase I (report): 35% of project grade (14 points of total grade)

Phase II (report+demo+coding): 65% of project grade (26 points of total grade)
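
The point values above follow from the grading policy: the project is 40% of the total grade, so 35% of it is 14 points and 65% of it is 26 points. The short Python sketch below only illustrates that arithmetic. The function names and the assumption that every component is scored on a 0-100 scale are hypothetical and are not part of the official grading procedure.

```python
# Weights taken from the grading policy (percent of total grade).
WEIGHTS = {"midterm1": 20, "midterm2": 20, "project": 40, "final": 20}
# Split of the project grade between the two phases (percent of project grade).
PROJECT_SPLIT = {"phase1": 35, "phase2": 65}


def phase_points(phase: str) -> float:
    """Points of the total grade contributed by one project phase."""
    return WEIGHTS["project"] * PROJECT_SPLIT[phase] / 100


def total_grade(scores: dict) -> float:
    """Combine component scores, each assumed to be on a 0-100 scale."""
    # Fold the two phase scores into a single project score first.
    project = sum(scores[p] * PROJECT_SPLIT[p] / 100 for p in PROJECT_SPLIT)
    parts = {**scores, "project": project}
    return sum(parts[name] * weight / 100 for name, weight in WEIGHTS.items())


print(phase_points("phase1"))  # 14.0
print(phase_points("phase2"))  # 26.0
```

For example, `total_grade({"midterm1": 80, "midterm2": 90, "phase1": 100, "phase2": 80, "final": 70})` would combine the two phase scores into one project score before applying the 40% project weight.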

Notes