For this project, you are welcome to use any modern language on a machine of your choice, as long as its possible for the TA to easily run the program for validation. Common Lisp and Scheme are particularly well suited to this project, especially since the input information uses a syntax which is particularly convenient for processing by these languages. However, C, C++, Java, Tcl, are also fine languages to use for this project, and reading in the input information should be fairly simple in these languages, too. The only restriction is that you must turnin a program that runs on one of the platforms available at the College of Computing and/or on Acme such as Solaris, Linux or Windows. Because the TA is particularly Windows inclined, you may submit Windows 95/98/ME programs under development environments not necessarily at the CoC as long as you clear it with him ahead of time and provide instructions for running. Keep in mind that the TA, although an accomplished computer scientist, is not familiar with every language and every platform, and thus may not be able to assist you as much on certain platforms. Do check with the TA if you plan to use an unusual language or operating system.
The class website contains links to several supplemental files for this
project. In addition to this document,there are files called stool.in,
stool.out, flashlight.in, and flashlight.out . The two .in files
are example inputs to the program. The two .out files demonstrate
what the proper outputs for this program should look like. If you
run the program on one of the .in files, it should produce output which
matches the corresponding .out file. The stool example is described
in detail in the following section. The flashlight example is somewhat
larger and demonstrates the desired behavior more thoroughly.
No
Yes
SEAT1
None
L1
The first seven lines of the input file contain facts about the stool. The stool consists of a seat named SEAT1 and three legs named L1, L2, and L3 .The remaining five lines of the input file contain queries about the stool. The five l nes of output are answers to these five queries in order. The first query asks if L1 is connected to L2. Since there is nothing in the facts given that states that they are connected, the line No is printed to the screen. The next query asks if L1 is connected to SEAT1; it is, so Yes is printed. The third query asks for a value; in particular it asks for a value which is a seat. Since SEAT1 is the only seat, SEAT1 is printed. The next query is a little more complicated. It asks for a value that is a leg and is connected to both L1 and SEAT1. Since no such value exists, None is printed. The last query is the most complicated one of all to answer. It asks for a value which is a leg and is connected to SEAT1. Since three such values exist, the program needs to choose one. In general, you want to return the value which is most closely related to the topics which have been recently addressed in the input. Specifically here, two of the last three queries specifically asked about L1 and none of the other legs have been asked about or provided as answers for several lines. Thus L1 is the most relevant node which correctly satisfies the query and should be chosen here.
To determine the most relevant node, you should use spreading activation. Each node that you have in memory should have an activation level, i.e., a number which indicates how relevant it is to the current context. Each time a fact or a query refers to an node in memory, you should increase the activation level of the node. Also, when the activation level of one node is increased, the activation levels of other nodes that relate to it should also increase (but not by as much). Furthermore, the activation levels of nodes in memory should decay over time; if a node is referred to many times in a row, it should be highly activated for queries that occur immediately after those references but should be less highly activated if it is accessed later on. In the example, L1 is obviously the most activated node in memory which satisfies the query since it is the only one which has been mentioned in recent queroes. The other two legs should also be somewhat activated. In particular, since SEAT1 was mentioned in the previous query, it should have received some activation at that time and some of this activation should have spread to those nodes which are connected to it (including L2 and L3). Furthermore, these legs may also retain some activation from the previous times in which they were explicitly referred to in earlier facts and queries. However, the combination of these smaller activation levels should definitely not be enough to outweigh the relatively high activation that L1 has from its recent explicit references.
Note that this description of how spreading activation works is deliberately
vague. Many questions are left open. How quickly do activation
levels decay? How much activation should spread from a node to a
related node? If activation spreads to an node, should that additional
activation spread to other nodes adjacent to it? If so, how much,
and when should it stop? You need to devise your own answers to questions
of this sort.
(FACT (IS-A node type))
(FACT (CONNECTED node node))
You may assume that any node which is referred to in a CONNECTED fact has first been defined in a IS-A fact. You may also assume that there are no nodes which have more than one type (however, there may be nodes which are connected to more than one other node).
Your program should be able to process the following sorts of queries:
(QUERY (IS-A node type))
(QUERY (CONNECTED node node))
(QUERY VALUE (IS-A node VALUE))
(QUERY VALUE (CONNECTED node VALUE))
(QUERY VALUE (CONNECTED VALUE node))
(QUERY VALUE (AND ...))
The remainder of a query with AND should be a sequence of two or more
terms which have the same form as the individual value queries. Note
that you should treat the CONNECTED relation as symmetric; i.e., if A is
connected to B then B is connected to A. The flashlight example provided
on the class website demonstrates these desired features of your project.
You should read that example over carefully and make sure you understand
why the program gives the answers that it does for each query before you
begin programming. If it simplies matters for you, you may assume
that all inputs appear in all capital letters and that all facts appear
before any queries do. You may also assume that there are no extraneous
spaces or blank lines. You will not be graded on the basis of obscure
“trick ”problems such as ways to abuse the syntax and/or provide meaningless
input information. As long as your program is able to handle inputs
of the form that the test files provided, it should be fine.