Dowser Class Extractor - Extraction Rules

Daniel Popescu
Georgia Institute of Technology

Introduction

This document describes the implemented rules of the Dowser ClassExtractor.

Rules

The following rules are currently implemented.

Direct Object Rule

                     +-------Os------+    
 +---Ds--+-----Ss----+        +--Ds--+    
 |       |           |        |      |    
 the elevator.n illuminates.v the button.n .


If a sentence contains an S link and an O link, create a class from the subject noun and one from the object noun. Afterwards, create a directed association from the subject class to the object class, named after the verb. This rule is only applicable if the Has Rule cannot be applied.


This rule produces Figure 1 from the example sentence.

Figure 1:
Image figDirectObject.png
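
The effect of the Direct Object Rule can be sketched in a few lines of self-contained C++. The types SimpleClass and SimpleModel and the function applyDirectObject are hypothetical stand-ins invented for this illustration; they are not part of the Dowser sources. The check against the Has Rule is reduced to a comparison with the four verb stems listed for that rule.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical, simplified model; the real Dowser classes are richer.
struct SimpleClass {
    std::string name;
    // association name (the verb) -> name of the target class
    std::map<std::string, std::string> associations;
};

using SimpleModel = std::map<std::string, SimpleClass>;

// Verb stems that are reserved for the Has Rule in this document.
inline bool isHasVerbStem(const std::string& stem) {
    static const std::set<std::string> stems = {
        "have", "possess", "contain", "include"};
    return stems.count(stem) > 0;
}

// Creates both classes and a directed, verb-named association.
// Returns false (and changes nothing) if the Has Rule would apply instead.
inline bool applyDirectObject(const std::string& subject,
                              const std::string& verbStem,
                              const std::string& object,
                              SimpleModel& model) {
    if (isHasVerbStem(verbStem)) {
        return false;
    }
    model[subject].name = subject;
    model[object].name = object;
    model[subject].associations[verbStem] = object;
    return true;
}
```

Called with the triple (elevator, illuminate, button), the sketch leaves two classes in the model and one association named illuminate that points from elevator to button.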

Has Rule

    +---Ds--+---Ss--+---Op--+     
    |       |       |       |     
   the elevator.n has.v buttons.n .


This rule is only applicable if a parsed sentence contains a subject and an object link and the verb stem is have, possess, contain, or include. In that case, the object class is aggregated to the subject class.


This rule produces Figure 2 from the example sentence.

Figure 2:
Image figHas.png

Bottom Up Classification Rule

               +---Ost--+    
    +--Ds-+---Ss--+  +--Ds-+    
    |     |       |  |     |
   an up-button is.v a button.n .


This rule is only applicable if a parsed sentence contains a subject and an object link and the verb stem is be. Furthermore, the subject needs an indefinite article.


This rule produces Figure 3 from the example sentence.

Figure 3:
Image figBottomUp.png

Passive With Object Rule

                            +-----Js----+ 
   +--Ds--+--Ss--+---Pv--+--MVp--+  +---Ds---+ 
   |      |      |       |       |  |        | 
   the button.n is.v connected.v to the controller.n .


Two classes are created from the sentence: one from the subject noun and one from the noun of the prepositional phrase (J link). The passive verb and the word connecting it to the prepositional phrase (here, to) describe the association.


This rule produces Figure 4 from the example sentence.

Figure 4:
Image figPassiveWithRule.png

Passive Rule

     +--Ds--+--Ss--+--Pv--+     
     |      |      |      |     
    the button.n is.v pressed.v .


One class is created from the subject noun. The passive verb is a state and therefore an attribute of the created class.


This rule produces Figure 5 from the example sentence.

Figure 5:
Image figPassive.png

Infinitive Object Rule

    +-------Ds-------+                              +---Js---+     
    |      +----AN---+--Ss--+-TO+-Ix+---Pv--+--MVp--+ +--Ds--+     
    |      |         |      |   |   |       |       | |      |     
   an elevator.n system.n is.v to be.v installed.v in a building.n .


This rule is applicable if the sentence contains the phrase is to be or are to be. If it is applicable, one class is created from the subject noun and one from the noun of the prepositional phrase. The passive verb and the word connecting it to the prepositional phrase describe the association.


This rule produces Figure 6 from the example sentence.

Figure 6:
Image figInfinitive.png

Becomes Rule

                     +-----Os----+     
    +--Ds--+----Ss---+     +--Ds-+     
    |      |         |     |     |    
   an engineer.n becomes.v a manager.n .


This rule is only applicable if a parsed sentence contains a subject and an object link and the verb stem is become. The verb become indicates a role of the subject, which can be modeled by an attribute.


This rule produces Figure 7 from the example sentence.

Figure 7:
Image figBecomes.png

With Rule

     +-------Ds------+      +----Jp----+       
     |      +---AN---+--Mp--+   +--Dmc-+       
     |      |        |      |   |      |       
    the control.n panel.n with the buttons.n ...


This rule is activated by the keyword with. If the keyword with connects two nouns, it indicates an aggregation.


This rule produces Figure 8 from the example sentence.

Figure 8:
Image figWith.png

Subject Rule

 +---------------------CO*s---------------------+         
 +-----------------Xc----------------+          |         
 +---Cs---+       +------Os-----+    |          |         
 |  +--Ds-+---Ss--+      +--Ds--+    |  +---Ds--+---Ss---+
 |  |     |       |      |      |    |  |       |        |
if the user.n presses.v the button.n , the elevator.n moves.v


Sometimes a clause describes an action that needs no object, for example when the system reacts to an event. An event clause starts with if or when (Cs or Cp link). If the system detects an event clause and the main clause has only a subject link, a class is created from the subject noun and the verb is added to that class as a method.


This rule produces Figure 9 from the example sentence.

Figure 9:
Image figSubject.png
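
The matching condition of the Subject Rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the Dowser implementation: ParsedLink is a hypothetical flat record of one link, labels are matched by prefix, and the link endpoints used with it have to be supplied by the caller (for the example sentence they can be read off the diagram by hand).

```cpp
#include <string>
#include <vector>

// Hypothetical flat representation of one link of a parsed sentence.
struct ParsedLink {
    std::string label;  // e.g. "Ss", "Os", "Cs"
    int left;           // index of the left word
    int right;          // index of the right word
};

// Result of the check: which class receives which method.
struct MethodCandidate {
    bool found;
    std::string className;
    std::string methodName;
};

static bool startsWith(const std::string& s, const std::string& prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

// The rule matches if the sentence has an event clause (Cs or Cp link)
// and the given verb has a subject link but no object link.
MethodCandidate findSubjectRuleMatch(const std::vector<std::string>& words,
                                     const std::vector<ParsedLink>& links,
                                     int verb) {
    bool hasEventClause = false;
    bool hasObject = false;
    int subject = -1;
    for (const ParsedLink& l : links) {
        if (startsWith(l.label, "Cs") || startsWith(l.label, "Cp")) {
            hasEventClause = true;
        }
        if (startsWith(l.label, "S") && l.right == verb) {
            subject = l.left;
        }
        if (startsWith(l.label, "O") && l.left == verb) {
            hasObject = true;
        }
    }
    if (!hasEventClause || hasObject || subject < 0) {
        return {false, "", ""};
    }
    return {true, words[subject], words[verb]};
}
```

For the example sentence, the verb moves has a subject link but no object link, so the sketch proposes the method moves for the class elevator. The verb presses is rejected because it has an object link.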

Genitive S Rule

                    +-------------Os------------+   
    +--Ds--+---Ss---+     +---Ds--+---YS--+--Ds-+   
    |      |        |     |       |       |     |   
   the system.n stores.v the customer.n 's.p name.n .


If the system detects a genitive signaled by 's, it creates two classes linked by one aggregation. In this example, it creates a class customer and aggregates the class name to the class customer.


This rule produces Figure 10 from the example sentence.

Figure 10:
Image figGenetive.png

Genitive Of Rule

                    +-----Os----+    +----Js----+   
    +--Ds--+---Ss---+     +--Ds-+-Mp-+  +---Ds--+  
    |      |        |     |     |    |  |       | 
   the system.n stores.v the name.n of the customer.n .


If the system detects a genitive signaled by of, it creates two classes linked by one aggregation. In this example, it creates a class customer and aggregates the class name to the class customer.


This rule produces Figure 10 from the example sentence.
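
Both genitive rules lead to the same aggregation; only the order of the two nouns in the sentence differs. The following sketch makes that explicit. The enum GenitiveSignal and the function genitiveAggregation are hypothetical names chosen for this illustration and do not appear in the Dowser sources.

```cpp
#include <string>
#include <utility>

// Hypothetical tag for the word that signals the genitive.
enum class GenitiveSignal { ApostropheS, Of };

// Returns {owner, part}: the part class is aggregated to the owner class.
// leftNoun and rightNoun are the nouns to the left and right of the signal.
std::pair<std::string, std::string> genitiveAggregation(
        GenitiveSignal signal,
        const std::string& leftNoun,
        const std::string& rightNoun) {
    if (signal == GenitiveSignal::ApostropheS) {
        // "customer 's name": the owner comes first.
        return {leftNoun, rightNoun};
    }
    // "name of the customer": the owner comes last.
    return {rightNoun, leftNoun};
}
```

With this sketch, customer 's name and name of the customer both yield the owner customer and the part name, which is why both rules refer to Figure 10.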

Amount Of Rule

                   +------Os-----+               
    +--Ds-+---Ss---+      +--Ds--+--Mp-+-Jp-+    
    |     |        |      |      |     |    |    
   the user.n requests.v the amount.n of money.n .


If a sentence contains an S link and an O link, and the O link is connected to the word amount, then create a class from the subject noun and one from the prepositional noun that is connected to the word amount. Afterwards, create a directed association named after the verb.


This rule produces Figure 11 from the example sentence.

Figure 11:
Image figAmountOf.png

Implementing a new rule

Abstract Class: ClassExtractionRule

All rules are implemented as subclasses of the abstract class ClassExtractionRule. Every rule must implement its two pure virtual methods:
	/*
	 * Checks whether this concrete rule is applicable to the
	 * given sentence at the word position currentWord. */
	virtual bool isThisRuleApplicable(Linkage linkage, 
			int currentWord, Sentence sent)=0;

	/*
	 * Applies this concrete rule and adds the newly extracted
	 * classes to the classMap. */ 
	virtual void applyThisRule(Linkage linkage, int currentWord,
		       	Sentence sent, map<string, DomainClass>& classMap)=0;
For every word in a sentence, the main controller checks whether any rule is applicable to that word. If a rule returns true from isThisRuleApplicable, the main controller runs applyThisRule of that rule. The implementation of applyThisRule should therefore extract the classes and associations. applyThisRule may assume that it is only called if the rule is applicable, so it will find all necessary links.
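
The dispatch described above can be sketched with a reduced interface. ToyRule, KeywordRule, and runRules are hypothetical names for this illustration; the real methods also receive the Linkage, the Sentence, and the class map. Whether the real controller stops after the first matching rule is not stated here, so this sketch simply applies every applicable rule.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, reduced counterpart of ClassExtractionRule.
class ToyRule {
public:
    virtual ~ToyRule() = default;
    virtual bool isThisRuleApplicable(const std::vector<std::string>& words,
                                      int currentWord) const = 0;
    virtual void applyThisRule(const std::vector<std::string>& words,
                               int currentWord,
                               std::vector<std::string>& extracted) = 0;
};

// Example rule: fires on one fixed keyword and records where it was found.
class KeywordRule : public ToyRule {
public:
    explicit KeywordRule(std::string keyword) : keyword_(std::move(keyword)) {}

    bool isThisRuleApplicable(const std::vector<std::string>& words,
                              int currentWord) const override {
        return words[currentWord] == keyword_;
    }

    void applyThisRule(const std::vector<std::string>& words,
                       int currentWord,
                       std::vector<std::string>& extracted) override {
        extracted.push_back(words[currentWord] + "@" +
                            std::to_string(currentWord));
    }

private:
    std::string keyword_;
};

// Controller loop: try every rule on every word, apply the ones that match.
void runRules(const std::vector<std::unique_ptr<ToyRule>>& rules,
              const std::vector<std::string>& words,
              std::vector<std::string>& extracted) {
    for (int w = 0; w < static_cast<int>(words.size()); ++w) {
        for (const auto& rule : rules) {
            if (rule->isThisRuleApplicable(words, w)) {
                rule->applyThisRule(words, w, extracted);
            }
        }
    }
}
```

Because applyThisRule is only reached after isThisRuleApplicable has returned true for the same word, a rule does not need to repeat its applicability checks inside applyThisRule.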


After a rule is created, it has to be registered in the method setupRules() of the class ClassExtractor. Additionally, the new source files have to be added to the Makefile.

An example rule

To illustrate how to develop a new rule, an example rule is implemented. The example rule is similar to the Has Rule. It is applied if a parsed sentence contains a subject and an object link and the verb stem is contain. In that case, the object class is aggregated to the subject class.
    +-Ds-+---Ss--+---Op--+     
    |    |       |       |     
   the box.n contains.v buttons.n .


Creating the stubs

The folder cppsrc contains all source files of the Dowser ClassExtractor. Furthermore, it contains a helper script to create new rules. This script creates the needed stubs for a new rule. It was developed in the language Ruby and needs a Ruby interpreter. The following command invokes the script.
ruby newRule.rb
The script asks for a name for the new rule. In this example, the new rule is called ContainsRule. The script creates the files ContainsRule.h and ContainsRule.cpp.


The header file should look like the following. It already includes all necessary information.

#ifndef CONTAINSRULE_H
#define CONTAINSRULE_H
#include "ClassExtractionRule.h"
extern "C" {
#include "link-includes.h"
}

class ContainsRule : public ClassExtractionRule 
{
protected:

	bool isThisRuleApplicable(Linkage linkage, 
			int currentWord, Sentence sent);

	void applyThisRule(Linkage linkage, int currentWord, Sentence sent,  
			map<string, DomainClass>& classMap);

public:
	ContainsRule(){
		ruleName ="ContainsRule";
	}

};
#endif  // CONTAINSRULE_H


Before editing, the implementation file should look like the following.

#include "ContainsRule.h"
#include "string.h"
#include "DomainClass.h"
#include "Association.h"
#include <string>
extern "C" {
#include "check.h"
}

bool ContainsRule::isThisRuleApplicable(Linkage linkage,
		int currentWord, Sentence sent){

return false;}

void ContainsRule::applyThisRule(Linkage linkage, int currentWord,
		Sentence sent, map<string, DomainClass>& classMap){

}

Implementing isThisRuleApplicable

First, the method isThisRuleApplicable is implemented. The ContainsRule is applicable if the current word has an S link and an O link and is equal to either contain or contains.


The code for the method looks like the following:

bool ContainsRule::isThisRuleApplicable(Linkage linkage,
		int currentWord, Sentence sent){
   int subject = getConnectedWord(linkage, currentWord, "S");
   int object = getConnectedWord(linkage, currentWord, "O");
   // The inner parentheses are required: && binds tighter than ||, so
   // without them "contains" would match even without S and O links.
   return (subject >= 0 && object >= 0 && 
      (strcmp( sentence_get_word(sent, currentWord), 
         "contain") == 0 ||
       strcmp( sentence_get_word(sent, currentWord),
         "contains") == 0));
}
The variable subject holds the position in the sentence of the word connected through the S link. If neither an Ss nor an Sp link is connected to the current word, getConnectedWord returns -1. The variable object is handled the same way for the O link.
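
The lookup behavior described above can be illustrated with a stand-alone sketch. ToyLink and findConnectedWord are hypothetical names; the real getConnectedWord works on a Linkage, and matching the label by prefix is an assumption inferred from the fact that the argument "S" covers both Ss and Sp links.

```cpp
#include <string>
#include <vector>

// Hypothetical flat representation of one link of a parsed sentence.
struct ToyLink {
    std::string label;  // e.g. "Ds", "Ss", "Op"
    int left;           // index of the left word
    int right;          // index of the right word
};

// Returns the index of the word connected to currentWord by a link whose
// label starts with prefix, or -1 if there is no such link.
int findConnectedWord(const std::vector<ToyLink>& links,
                      int currentWord,
                      const std::string& prefix) {
    for (const ToyLink& l : links) {
        if (l.label.compare(0, prefix.size(), prefix) != 0) {
            continue;  // label does not start with the requested prefix
        }
        if (l.left == currentWord) {
            return l.right;
        }
        if (l.right == currentWord) {
            return l.left;
        }
    }
    return -1;
}
```

For the sentence the box contains buttons, with the links Ds(0,1), Ss(1,2), and Op(2,3), the word contains at position 2 is connected to position 1 through an S link and to position 3 through an O link. A lookup for an O link from position 0 yields -1.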

Implementing the extraction method

The implementation of applyThisRule extracts the subject noun and the object noun as new classes. Afterwards, it aggregates the object class to the subject class.
void ContainsRule::applyThisRule(Linkage linkage, int currentWord,
		Sentence sent, map<string, DomainClass>& classMap){
   DomainClass& domainClass = getDomainClass("S", currentWord,
                                             linkage, sent, classMap);

   DomainClass& aggregatedClass = getDomainClass("O", currentWord,
                                             linkage, sent, classMap);

   map<string, DomainClass*> aggregations = domainClass.getAggregations();
   aggregations[aggregatedClass.getName()] = &aggregatedClass;
   domainClass.setAggregations(aggregations);

   //classMap must contain all extracted classes
   classMap[aggregatedClass.getName()] = aggregatedClass;
   classMap[domainClass.getName()] = domainClass; 
}
The new rule is now complete. As a next step, it must be registered in the main program.

Adding the new rule to the main program

First, the file ClassExtractor must be edited to add the new rule. ClassExtractor must include the header file of the new rule, and the method setupRules must register the new rule.


Second, the Makefile has to be edited. The new object file has to be added to the OBJECTS variable, and a build rule for it has to be added to the Makefile. Afterwards, the OBJECTS variable contains the entry ${OBJ}/ContainsRule.o and the Makefile contains the following lines:

${OBJ}/ContainsRule.o: ${CPPSRC}/ContainsRule.cpp ${INCLUDES}
		${CC} ${CFLAGS} -I${INC} ${CPPSRC}/ContainsRule.cpp -o ${OBJ}/ContainsRule.o
The second line must begin with two tab characters, not spaces.


As a result, the new rule is added to the system after recompiling.

Daniel Michael Popescu 2005-12-15