The overall objective of this project is to try out some advanced database implementation techniques on top of existing database engines. You will be given some open source software packages (in Java). The project requires students to fully understand the open source database engine implementation. You are expected to enhance the existing package by adding more advanced database features (e.g., add aggregate functions, query optimization methods, support larger data files, use additional access methods).
The project will be implemented in two phases. The first phase will focus on two components: (1) the study and understanding of the existing software and report its implementation details (including limitations); and (2) the design decision on which three advanced database features that you intend to add on top of this open source software package and why they are important features. You may use applications you know to justify your decision.
The second phase will extend the open source database engine to provide richer database features. You are required to provide the conceptual and implementation design for each of the three advanced features identified in the Phase I report. Then you are expected to implement these features on top of the given open source database software. Please bear in mind that your implementation should have a user-friendly GUI to demonstrate the new database functionalities.
The project has two milestones:
Now we describe the detailed requirements for each of the two phases below.
PHASE I:
For the project, you can choose one of the two suggested open source database implementations (both in Java) below:
SELECT (with joins) UPDATE INSERT DELETE CREATE TABLE DROP TABLE ALTER TABLE xx ADD coldef dBase only ALTER TABLE xx DROP [COLUMN] col dBase only ALTER TABLE xx RENAME [COLUMN] foo TO bar dBase only WHERE only AND is supported
tinySQL now supports all JDBC-Data types and the ODBC-Minimum grammar is implemented. Although tinySQL is not optimized for speed, its architecture makes it possible to build SQL interfaces to non-SQL data sources, such as .DBF (dBase database file format) files or text files.
hsqldb is a relational database engine written in Java, with a JDBC driver, supporting a rich subset of ANSI-92 SQL (BNF tree format). It offers a small (less than 160k), fast database engine which offers both in memory and disk based tables. Embedded and server modes are available. Additionally, it includes tools such as a minimal web server, in-memory query and management tools (can be run as applets) and a number of demonstration examples.
hsqldb is more complex than tinySQL database engine. It has many features, including view support, aggregate functions, triggers, multi-dimensional indexes, and most JDBC interface support. For more details, please visit the software's home page [2].
You are required to study the existing database engine and give a thorough report on how the engine is implemented. For example, you can report details on how the engine handle the following:
You will also need to identify the limitations of the existing software so that you can improve some of them in Phase II of the project.
In addition to the study and understanding of the existing software and report its implementation details (including limitations), you are required to describe your design decision on which three advanced database features that you intend to add on top of this open source software package and why they are important features. You may use applications you know to justify your decision. For example, if you intend to add query optimization component and index to tinySQL, you will need to introduce commands such as CREATE INDEX, LOAD INDEX LOAD RSTATES, LOAD ISTATS in order to collect and maintain statistics about all relations created and populated in the systems catalog. You may refer to the Project Option 1 description to learn more on how to design and implement such commands. At Phase I of this project, you are expected to accomplish the following:
IMPORTANT:
As a milestone of the project, we require that each project team produce a
Phase I report documenting all five requirements listed above. You are not
required to turn in any code for Phase I.
Due date: Tuesday of the 8th Week, which is February 25
2003.
PHASE II:
A main objective of the second phase is to extend the open source database engine to provide richer database features. You are required to improve the existing package in at least 3 aspects. Some suggested aspects include:
You are required to provide the conceptual and implementation design for each of the three advanced features identified in the Phase I report. You are expected to provide the best possible implementation of the three features on top of the chosen open source database software. At the end of Phase II, you will need to submit a written report on the design and implementation considerations, the system architecture, and the design choices for the improved database engine. In addition, you are expected to submit the source code of your system and the README instruction file. The source code should be well documented and it should describe which team member made what changes to the original package. A demo will be scheduled for your final product in the last week of the semester. Each team member should hand in a page describing the tasks you have done and your contributions to the team project and your evaluation on the rest of the team members. You may send this page by email to the instructor and the TA separately. Your project grade will be based on the reports from the two phases, the demo quality, and the quality of source code.