Project 5 - Program Description
DTI/RABID
Prepare a pair of Java programs, DTI (Document-Term Indexer) and RABID (Retrieve And Browse Indexed Documents), that enable the retrieval of documents from a corpus. A corpus is a collection of text documents; for example, a collection of movie reviews from rec.arts.movies.reviews can be found here. The first program analyzes the corpus and produces an index. The second program takes user requests and uses the index to provide access to the documents.
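To make the division of labor concrete, here is a minimal sketch of one common way to build such an index: an inverted index that maps each term to the documents containing it. It is only an illustration, not the required design. The class name InvertedIndexSketch, the methods addDocument and lookup, the tokenization rule, and the sample texts are all assumptions, and the code assumes Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration; the assignment does not prescribe this structure.
public class InvertedIndexSketch {
    // Maps each term to the sorted set of ids of the documents that contain it.
    private final Map<String, Set<Integer>> postings = new TreeMap<>();

    // Indexing step (the DTI role): record every term of one document.
    // Assumed tokenization: lowercase, split on anything that is not a-z or 0-9.
    public void addDocument(int docId, String text) {
        for (String token : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Retrieval step (the RABID role): return the ids of documents containing the term.
    public List<Integer> lookup(String term) {
        Set<Integer> ids = postings.get(term.toLowerCase());
        return ids == null ? new ArrayList<>() : new ArrayList<>(ids);
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        // Made-up sample documents standing in for a corpus.
        index.addDocument(1, "A thrilling space adventure");
        index.addDocument(2, "A slow but moving drama");
        index.addDocument(3, "Space drama with a thrilling finale");
        System.out.println("space -> " + index.lookup("space")); // space -> [1, 3]
        System.out.println("drama -> " + index.lookup("drama")); // drama -> [2, 3]
    }
}
```

In the actual project the two roles live in separate programs, so the index would have to be written to disk by DTI and read back by RABID; the sketch keeps everything in memory to stay short.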
Make sure that your programs conform to the following development constraints:
- You may write the programs on any machine for which you have the Java tools, but they will be tested on the College of Computing tampere machine using the version of Java in /usr/local/public/packages/jdk.latest.
- Submit your deliverables via the class Swiki. Use the rules for placing incremental releases described in the Process Description document.
- All code required to execute the programs that is not part of the Java distribution on tampere must be submitted in source form. Your programs must compile using the javac command without any options specified.
- Your programs should be applications, not applets. That is, they must execute from the command line using the java command and not from a web browser. As applications, they may use a graphical user interface but are not required to do so.
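As a rough illustration of the application (not applet) requirement, a command-line entry point might look like the sketch below. The class name DtiMainSketch, the two arguments, and the usage message are hypothetical; the assignment does not define the command-line interface.

```java
// Hypothetical skeleton of a command-line application; argument names are assumptions.
public class DtiMainSketch {
    // Assumed usage string; the real programs may take different arguments.
    static final String USAGE = "usage: java DtiMainSketch <corpus-dir> <index-file>";

    // Returns null when the arguments look acceptable, otherwise an error message.
    static String validate(String[] args) {
        if (args.length != 2) {
            return USAGE;
        }
        return null;
    }

    // An application starts from main, which the java command invokes directly.
    public static void main(String[] args) {
        String error = validate(args);
        if (error != null) {
            System.err.println(error);
            System.exit(1);
        }
        System.out.println("Indexing corpus " + args[0] + " into " + args[1]);
    }
}
```

A file like this compiles with a plain javac DtiMainSketch.java and runs with java DtiMainSketch followed by the two arguments, which matches the constraint that no compiler options be needed.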
The customer needs for this program are provided here. There are also separate, process-related requirements for this project, which are described here. For example, you must prepare Process Plan and Process Assessment documents as well as various development documents.
The teams for this project are described here.