nextupprevious
Next:Bibliography

CS 4420 DATABASE PROJECT
(Option 2)
 
 

Spring Semester 2004


 
 

The overall objective of this project is to try out some advanced database implementation techniques on top of existing database engines. You will be given some open source software packages (in Java). The project requires students to fully understand the open source database engine implementation. You are expected to enhance the existing package by adding more advanced database features (e.g., add aggregate functions, query optimization methods, support larger data files, use additional access methods).

The project will be implemented in two phases. The first phase will focus on two components: (1) the study and understanding of the existing software and report its implementation details (including limitations); and (2) the design decision on which three advanced database features that you intend to add on top of this open source software package and why they are important features. You may use applications you know to justify your decision.

The second phase will extend the open source database engine to provide richer database features. You are required to provide the conceptual and implementation design for each of the three advanced features identified in the Phase I report. Then you are expected to implement these features on top of the given open source database software. Please bear in mind that your implementation should have a user-friendly GUI to demonstrate the new database functionalities.

The project has two milestones:

Milestone 1:
A report on Phase I is due on Thurday of the eighth week of the course, which is February 26, 2004. You may hand in your report in class on that day or email it to instructor and TA.
Milestone 2:
The software package of the entire product of your project and a final report will be produced. Each project team will schedule a demo date and time to demonstrate their product and walk through the code. The due date of your final project is the midnight of the Thursday of the last week of the semester, which is April 22, 2004. You should send in a compressed tar file including the source code with good documentation, the printout of your data used in the project, some sample screenshots, and the final report.
A detailed description of the deliverables for each of the two milestones will be made available online at the course web site.

Now we describe the detailed requirements for each of the two phases below.
 
 
 

PHASE I:

For the project, you can choose one of the two suggested open source database implementations (both in Java) below:

  1. tinySQL [1]

  2. tinySQL is a lightweight, 100% Java SQL engine that also includes a JDBC driver. It supports the following capabilities (from the README file):
    SELECT (with joins)                       
    UPDATE
    INSERT
    DELETE
    CREATE TABLE
    DROP TABLE
    ALTER TABLE xx ADD coldef                  dBase only
    ALTER TABLE xx DROP [COLUMN] col           dBase only
    ALTER TABLE xx RENAME [COLUMN] foo TO bar  dBase only
    WHERE                                      only AND is supported
    tinySQL now supports all JDBC-Data types and the ODBC-Minimum grammar is implemented. Although tinySQL is not optimized for speed, its architecture makes it possible to build SQL interfaces to non-SQL data sources, such as .DBF (dBase database file format) files or text files.
  3. hsqldb [2]

  4. hsqldb is a relational database engine written in Java, with a JDBC driver, supporting a rich subset of ANSI-92 SQL (BNF tree format). It offers a small (less than 160k), fast database engine which offers both in memory and disk based tables. Embedded and server modes are available. Additionally, it includes tools such as a minimal web server, in-memory query and management tools (can be run as applets) and a number of demonstration examples.

    hsqldb is more complex than tinySQL database engine. It has many features, including view support, aggregate functions, triggers, multi-dimensional indexes, and most JDBC interface support. For more details, please visit the software's home page [2].

You are required to study the existing database engine and give a thorough report on how the engine is implemented. For example, you can report details on how the engine handle the following: You will also need to identify the limitations of the existing software so that you can improve some of them in Phase II of the project.

In addition to the study and understanding of the existing software and report its implementation details (including limitations), you are required to describe your design decision on which three advanced database features that you intend to add on top of this open source software package and why they are important features. You may use applications you know to justify your decision. For example, if you intend to add query optimization component and index to tinySQL, you will need to introduce commands such as CREATE INDEX, LOAD INDEX LOAD RSTATES, LOAD ISTATS in order to collect and maintain statistics about all relations created and populated in the systems catalog. You may refer to the Project Option 1 description to learn more on how to design and implement such commands. At Phase I of this project, you are expected to accomplish the following:

  1. identify which open source product to use
  2. identify which basic database functions are supported by the chosen open source package
  3. identify what additional database features you would like to add to the top of chosen open source package
  4. for each feature, identify the list of basic components (e.g. commands or modules) that you need to write and incorporate into the chosen open source package
  5. identify potential benefits and difficulties for adding the new features to the chosen open source package.
IMPORTANT: As a milestone of the project, we require that each project team produce a Phase I report documenting all five requirements listed above. You are not required to turn in any code for Phase I.
Due date: Thurday of the 8th Week, which is February 26 2004.
 
 
 
 

PHASE II:

A main objective of the second phase is to extend the open source database engine to provide richer database features. You are required to improve the existing package in at least 3 aspects. Some suggested aspects include:

You are required to provide the conceptual and implementation design for each of the three advanced features identified in the Phase I report. You are expected to provide the best possible implementation of the three features on top of the chosen open source database software. At the end of Phase II, you will need to submit a written report on the design and implementation considerations, the system architecture, and the design choices for the improved database engine. In addition, you are expected to submit the source code of your system and the README instruction file. The source code should be well documented and it should describe which team member made what changes to the original package. A demo will be scheduled for your final product in the last week of the semester. Each team member should hand in a page describing the tasks you have done and your contributions to the team project and your evaluation on the rest of the team members. You may send this page by email to the instructor and the TA separately. Your project grade will be based on the reports from the two phases, the demo quality, and the quality of source code.
 



nextupprevious
Next:Bibliography
Wei Tang 2003-01-09