SYLLABUS
CS 6400 DATABASE SYSTEMS CONCEPTS AND DESIGN
(Spring 2005- Navathe)
Objective: The objective of this course is to give an advanced introduction to the concepts for modeling, designing, querying and managing large databases. The course covers a spectrum of topics involved with current approaches to modeling and design of databases and the design of DBMSs to manage databases. The relational model is emphasized, and relational database management systems are addressed from the standpoint of query optimization, database security, transaction management, concurrency control, and recovery. Other topics to be introduced include object-oriented databases, distributed databases, data warehousing and data mining. Some topics, such as advanced data models (Ch. 24), ODMG and object-relational database management (Ch. 21, 22), and XML (Ch. 26), are presently not intended to be covered. This course is an overview course that is followed by specialized database courses: CS 6411: O-O database models and systems, CS 6421: active and dynamic DBMSs, CS 6430: parallel and distributed database systems and applications. The latter three are in the process of being merged into an Advanced Database Management course tentatively to be offered in Fall 2004.
A basic knowledge of programming languages, files, and the application development process is assumed. A first course in database management covering introduction to relational, hierarchical and network databases is useful but not a must. Those with sufficient undergraduate database coursework are advised to go straight to one of the advanced courses. Students looking for a hands-on application development experience should try to take CS4400 (which is not so easy to get into!). Students who have had the CS4400 course already are recommended not to take this course for credit due to the substantial overlap in the first half of the course. Those looking for internals of a DBMS should get into CS4420.
Instructor: Prof. Sham Navathe
Office: CRB 259
Phone: 894-0537
e-mail: sham@cc
Office Hours: after class in CoC; other times by appointment.
Class time: Tues-Thurs. 8:00-9:30
Class room: Coll. of Comp. 101
Secretary: Dani Denton in CoC Rm 252 (denton@cc, 5-4785)
Teaching Assistant: None
Textbook:
[EN] R. Elmasri and S.B. Navathe, Fundamentals of Database Systems, Addison Wesley, Edition 4, 2004.
Note: Edition 4 is a fairly revised version of edition 3 with new chapters 9 (Advanced SQL), 26 (XML), 29 (Emerging Technologies) and revised chapters 12 (Overview of the DB design process), 23 (DB Security), 27 (Data Mining) etc. The earlier chapters 1-15 (except 9 and 12) were rewritten with minor revisions. Chapters 15-19 are almost identical with edition 3. Chapter 10 on Oracle/Access and Chapter 25 on Deductive Databases from edition 3 have been taken out.
Reference Books (FYI only - Not Required):
1. [SKS] A. Silberschatz, H. Korth and S. Sudarshan, Database System Concepts, 4th edition, McGraw Hill, 2001.
2. [R] R. Ramakrishnan and J. Gehrke, Database Management Systems, 3rd edition, WCB McGraw Hill, 2003.
3. [GUW] H. Garcia-Molina, J. D. Ullman, J. Widom, Database Systems: The Complete Book, Prentice Hall, 2002.
4. [D] C.J. Date, An Introduction to Database Systems, Vol. 2, Addison Wesley, 1983.
5. [M] D. Maier, The Theory of Relational Databases, Computer Science Press, 1983.
6. [U] J.D. Ullman, Principles of Database and Knowledge-Base Systems, Vol. 1 and 2, Computer Science Press, 1989.
7. [S] Readings in Database Systems, edited by Michael Stonebraker, Morgan Kaufmann, Ed. 2, 1994.
(The above books may be found on reserve in the library under either Navathe, or Omiecinski, or Mark.)
Some papers may be assigned as additional readings.
Project or a research term paper:
Students will work on one of two different deliverables: (a) a project, involving either the use of a DBMS, or an implementation of some aspect of a DBMS, or some innovative application of database concepts, to be done in teams of two or three, or (b) a research term paper, to be done individually, which goes beyond simply summarizing the published work in some area. [Note: without a TA this semester, the project teams will essentially work by themselves. The instructor can possibly arrange access to the databases on campus, but will not be able to give much implementation guidance.]
A. PROJECT:
Students are welcome to work in teams to propose a project of their own to develop some aspect of a database system, create some innovative application, or investigate or evaluate a database management system or tool. This is to give an opportunity to the (especially non-CS) students who want to try some hands-on database experience that this course does not otherwise offer. I am completely open about this and would like students to think of applications in their own areas of interest and see how current database technology-based solutions can be developed, or suggest techniques and approaches that merit further investigation. This is your opportunity to investigate how DB technology may be applied to graphics, robotics, virtual reality, network management or any such applications. Students should hand in a proposal by February 17th stating (i) the goal, (ii) the problem definition, (iii) a description of the design, approach, or unique features, and (iv) what will be implemented, what will be written up and what will be demonstrated. The platform, the system, the language etc. are totally flexible. Students are encouraged to interact with the other Database faculty and Ph.D. students so as to define a "meaningful yet small" project that integrates with any ongoing research in the database group. You may also consult the web pages for the database group under the category of research from the College of Computing home page. Those with an interest in pursuing the database field further are strongly encouraged to try to interact with ongoing research activities and do a project that will lead to a further independent study (CS 8903).
Examples under A:
i) implementation of a specific file access or indexing scheme for a temporal database
ii) use of a DBMS to build an image database
iii) implementing ideas from one of the papers under Storage Management in [S]
iv) implementing a DBMS that simulates storing, accessing, and limited querying of data in the ER model
v) investigating the performance of some data mining algorithms
vi) a medical or bioinformatics application using a DBMS
vii) an engineering information system prototype application
viii) databases for virtual reality
ix) implementing some aspect of a genome or geographic data management system (see Ch 29).
x) developing a web-based querying engine for unstructured data
B. RESEARCH TERM PAPER: Hand in the proposed topic, the scope and focus of the paper and a preliminary set of references by Feb. 17th. The instructor will try to help you with further search for material. The term paper serves two goals: first, it allows "non-computer science" majors to apply database technology to their field of interest; second, it allows one to explore topics in which one could do further independent study, do a master's project, or take an advanced DB course later. Please refrain from writing a general paper, say, on multimedia database management. That topic could be broken into: content-based retrieval from multimedia data, factors affecting quality of service in multimedia databases, efficient handling of stream data, applying data mining to images, etc.
A possible list of topics (this is only a suggested list and you will have to refine each topic):
1. Distributed Databases. (Take some aspect: query processing, concurrency control, recovery, distribution design. See Ch. 25 in [EN]. Also see books by Ceri/Pelagatti and Ozsu and Valduriez.)
2. Database Performance Measurement Techniques
3. Knowledge Management (Knowledge representation, recursive query processing, rule processing and optimization, etc.)
4. User Interfaces and data visualization.
5. Concurrency Control and/or recovery algorithms for specific applications - e.g., mobile databases, engineering design databases. (see a book by Bernstein et al.).
6. Active Databases (see book by Widom and Ceri).
7. Different aspects of Object-oriented Database Management (e.g., query languages, theoretical models, storage organization). (Look at a readings book by Zdonik and Maier or books on reserve.)
8. Database management for CAD/CAM and manufacturing applications
9. Geographic Information Systems- database issues
10. Office Information Systems- database issues. (Look at issues of ACM Trans. on Office Information Systems. The word "office" has been recently dropped from the title.)
11. Database Security - security models, security implementation, relationship to web databases.
12. Parallel Databases: architecture, query processing, join algorithms, performance. (consult Prof. Omiecinski for references.)
13. Temporal Databases - language issues, storage and transaction management (consult Prof. Mark).
14. Multimedia Database Management
15. Distributed Database Design, Redesign, Reorganization.
16. Database management issues on the web.
17. Workflow modeling and process modeling - techniques and tools.
Students are welcome to propose any other interesting topics. Papers should be 10-12 pages (double spaced, 12-point font) and will be graded on the basis of: (i) content: amount of effort and breadth, (ii) understanding and synthesis of the topic shown by the author, (iii) analysis and depth that goes beyond just copying parts of references, and (iv) organization and presentation. Creative papers exploring new ideas or techniques coupled with some experimental proposals are most welcome.
Due date for term paper/project: April 29th, the last class (a hard deadline). Papers must be submitted in duplicate (one copy will be returned). A paper must consult at least 5 outside references, and each paper must have a detailed bibliography in the following style:
1. T. Wakayama, S. Kannapan, C. M. Khoong, S.B. Navathe, J. Yates (Eds.), Information and Process Integration in Enterprises: Rethinking Documents, Kluwer Academic Publishers, 1998. [An edited book].
2. S.B. Navathe and R. Ahmed, "Temporal Extensions to the Relational Model and SQL," Chapter 4 in "Temporal Database Management," (A. Tansel, et al., eds.), Benjamin Cummings, 1993. [A chapter from a book].
3. H. Beck, T. Anwar, and S.B. Navathe, "A Conceptual Clustering Algorithm for Database Schema Design," in IEEE Transactions on Knowledge and Data Engineering, Vol. 6, No. 3, June 1994. [A paper in a journal].
4. A. Savasere, E. Omiecinski and S. B. Navathe, "Discovery of Multiple-Level Association Rules from Large Databases," Proc. 21st Int. Conf. on Very Large Databases, Zurich, Switzerland, September 1995. [A paper in a conference].
Grading:
Quizzes (in class): 50-55%.
Project or paper: 25%.
Final (cumulative addressing selected topics): 20-25%.
Quizzes will be multiple choice with numeric grading. Cumulative score on quizzes and on final will be converted to a letter grade. The term paper will be given a letter grade. The final grade will reflect a weighted average of these letter grades. (E.g., B is 3.0, B+ is 3.33, A- is 3.67, A is 4.0). Those with weighted total above 3.5 are candidates to receive an A. The instructor may use subjective judgment to adjust "border-line" cases up or down.
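As an illustration only, the weighted average described above can be sketched in a few lines of Python. The grade points are the four quoted above. The weights (55% quizzes, 25% project or paper, 20% final) are just one assumed choice within the stated ranges, and treating the quizzes and the final as two separate letter grades is also an assumption. The function names are hypothetical, and the instructor's subjective adjustment of border-line cases is not modeled.

```python
# Grade points quoted in the syllabus; other letter grades are not listed there.
GRADE_POINTS = {"B": 3.0, "B+": 3.33, "A-": 3.67, "A": 4.0}

# ASSUMPTION: one choice of weights within the stated ranges
# (quizzes 50-55%, project or paper 25%, final 20-25%).
ASSUMED_WEIGHTS = {"quizzes": 0.55, "project": 0.25, "final": 0.20}

# Weighted totals above this value are candidates for an A.
A_CANDIDATE_THRESHOLD = 3.5


def weighted_total(letters, weights=ASSUMED_WEIGHTS):
    """Return the weighted average of grade points for the given letter grades."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[part] * GRADE_POINTS[letters[part]] for part in weights)


def is_a_candidate(letters, weights=ASSUMED_WEIGHTS):
    """True if the weighted total is above the A-candidate threshold."""
    return weighted_total(letters, weights) > A_CANDIDATE_THRESHOLD


# Hypothetical student: A- on quizzes, A on the project, B+ on the final.
example = {"quizzes": "A-", "project": "A", "final": "B+"}
print(round(weighted_total(example), 2), is_a_candidate(example))
```

With these assumed weights the hypothetical student's weighted total is about 3.68, which is above 3.5, so the student would be a candidate for an A.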
We will try to maintain a web page for this class at www.cc.gatech.edu/classes/index.html#2004. Since we do not have a T.A., please try to see the instructor after class and during office hours as much as possible. If you send email to the instructor related to the course, always put CS6400 in the subject header.
NOTES:
Initially, the material the instructor will use will be from the CS4400 webpages for Fall 2003:
http://www.cc.gatech.edu/classes/AY2004/cs4400_fall/sham_notes.html
(The main page for the course has a link to it under “Notes”).
Some of the notes will come from the CS4440 class in Fall 2003, where the notes are under “Topics and Materials”.
Tentative List of Sources for Research Topics and Papers:
Proc. of SIGMOD Conference (SIGMOD): ACM-Special Interest Group on Mgmt. of Data (1974-)
Proc. of the Very Large Database (VLDB) Conference (recent publisher: Morgan Kaufmann) (1975-)
Proc. of IEEE Data Engineering Conf. (ICDE) (1984-)
TODS: ACM Transactions on Database Systems (1976-)
IEEE/TKDE: IEEE Transactions on Knowledge and Data Engineering (1990-)
TOIS: ACM Transactions on Office Information Systems.
Website for database publications by author and topics:
http://dblp.uni-trier.de (maintained by Dr. Michael Ley at Univ. of Trier, Germany).
Tentative Schedule of Coverage:
A detailed schedule will be prepared and made available on the web pages later.
We intend to cover almost the entire textbook.
Exceptions:
Chapters 21, 22, 24, 26 (not planned to be covered; partial coverage depending on time available)
Chapters 9,12 will be covered in a summary form.
Quizzes will be given during class time and will be multiple choice unless otherwise announced.
For a former offering of this course (taught by me), look at the Spring 2002 web pages for the course.