org.jscience.linguistics.kif
Class WordNet

java.lang.Object
  extended by org.jscience.linguistics.kif.WordNet

public class WordNet
extends java.lang.Object

This program finds and displays SUMO terms that are related in meaning to the English expressions entered as input. It uses four WordNet data files, "NOUN.IDX", "NOUN.EXC", "VERB.IDX" and "VERB.EXC", as well as two WordNet-to-SUMO mapping files called "MergeMappings.txt" and "WordNetMappings-verbs.txt". The main part of the program prompts the user for an English term and then returns the associated SUMO concepts. The two main public methods are initOnce() and page().


Field Summary
static int ADJECTIVE
          Part-of-speech code for adjectives.
static int ADVERB
          Part-of-speech code for adverbs.
static boolean initNeeded
          Flag indicating whether initialization is still needed.
static int NOUN
          Part-of-speech code for nouns.
static int VERB
          Part-of-speech code for verbs.
static WordNet wn
          The static WordNet instance held by this class.
 
Constructor Summary
WordNet()
           
 
Method Summary
 boolean containsWord(java.lang.String word, int pos)
          Whether WordNet contains the given word.
 java.lang.String findSUMOWordSense(java.lang.String word, java.util.ArrayList words, int POS)
          Return the best guess at the synset for the given word in the context of the sentence.
 java.lang.String getSUMOterm(java.lang.String word, int pos)
          Get the SUMO term for the given root form word and part of speech.
static void initOnce()
          Read the WordNet files only on initialization of the class.
static void main(java.lang.String[] args)
          A main method, used only for testing.
 java.lang.String nounRootForm(java.lang.String mixedCase, java.lang.String input)
          Return the root form of the noun, or null if it's not in the lexicon.
 java.lang.String page(java.lang.String inp, int pos, java.lang.String sumokbname)
          This is the regular point of entry for this class.
 void readSenseIndex()
           
 void readWordFrequencies()
          Read word co-occurrence frequencies into a HashMap of HashMaps where the key is a word sense of the form word_POS_num, signifying the word, part of speech and number of the sense in WordNet.
 java.lang.String verbRootForm(java.lang.String mixedCase, java.lang.String input)
          Return the present tense singular form of the verb, or null if it's not in the lexicon.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

wn

public static WordNet wn
The static WordNet instance held by this class.


initNeeded

public static boolean initNeeded
Flag indicating whether initialization is still needed.


NOUN

public static final int NOUN
Part-of-speech code for nouns.

See Also:
Constant Field Values

VERB

public static final int VERB
Part-of-speech code for verbs.

See Also:
Constant Field Values

ADJECTIVE

public static final int ADJECTIVE
Part-of-speech code for adjectives.

See Also:
Constant Field Values

ADVERB

public static final int ADVERB
Part-of-speech code for adverbs.

See Also:
Constant Field Values
Constructor Detail

WordNet

public WordNet()
Method Detail

readWordFrequencies

public void readWordFrequencies()
Read word co-occurrence frequencies into a HashMap of HashMaps where the key is a word sense of the form word_POS_num, signifying the word, part of speech and number of the sense in WordNet. The value is a HashMap of words and the number of times each word co-occurs in sentences with the word sense given in the key.
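
The nested map described above could be shaped roughly as in the following minimal sketch. The class CooccurrenceSketch, its method names and the "NN" tag spelling are hypothetical illustrations and are not taken from the WordNet class itself.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not part of org.jscience.linguistics.kif.WordNet.
class CooccurrenceSketch {

    // Outer key: a word sense in the form word_POS_num.
    // Inner map: co-occurring word -> number of sentences it shares with that sense.
    final Map<String, Map<String, Integer>> frequencies = new HashMap<>();

    // Build a sense key such as "bank_NN_1" (the POS tag spelling is an assumption).
    static String senseKey(String word, String pos, int senseNumber) {
        return word + "_" + pos + "_" + senseNumber;
    }

    // Record one more sentence in which otherWord appears with the given sense.
    void addCooccurrence(String senseKey, String otherWord) {
        frequencies.computeIfAbsent(senseKey, k -> new HashMap<>())
                   .merge(otherWord, 1, Integer::sum);
    }

    // Look up a count, treating missing senses and missing words as zero.
    int count(String senseKey, String otherWord) {
        Map<String, Integer> inner = frequencies.get(senseKey);
        if (inner == null) {
            return 0;
        }
        return inner.getOrDefault(otherWord, 0);
    }
}
```

Keeping the sense key as a single string makes the outer lookup a plain map access; the inner map then answers "how often does this word appear near this sense".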


readSenseIndex

public void readSenseIndex()

findSUMOWordSense

public java.lang.String findSUMOWordSense(java.lang.String word,
                                          java.util.ArrayList words,
                                          int POS)
Return the best guess at the synset for the given word in the context of the sentence. Returns an 8-digit WordNet synset file byte offset as a string.

Parameters:
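word - The word whose sense is to be determined.
words - The words of the sentence that provides the context.
POS - The part of speech of the word.
Returns:
The 8-digit WordNet synset file byte offset, as a string.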
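
One way a context-based sense guess can work is to score each candidate sense by how often the surrounding words co-occur with it, then report the winner as a zero-padded 8-digit offset. The sketch below assumes exactly that; SenseGuessSketch, bestSense and formatOffset are hypothetical names, and the scoring rule is an assumption rather than the documented behavior of findSUMOWordSense.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration; the scoring rule is an assumption, not the class's method.
class SenseGuessSketch {

    // Return the candidate sense whose co-occurrence counts with the context
    // words sum highest, or null when there are no candidates.
    // Ties are resolved in favor of the earlier candidate.
    static String bestSense(Map<String, Map<String, Integer>> frequencies,
                            List<String> candidateSenses,
                            List<String> context) {
        String best = null;
        int bestScore = -1;
        for (String sense : candidateSenses) {
            Map<String, Integer> counts = frequencies.get(sense);
            int score = 0;
            if (counts != null) {
                for (String w : context) {
                    score += counts.getOrDefault(w, 0);
                }
            }
            if (score > bestScore) {
                bestScore = score;
                best = sense;
            }
        }
        return best;
    }

    // Render a byte offset as an 8-digit, zero-padded string.
    static String formatOffset(long byteOffset) {
        return String.format(Locale.ROOT, "%08d", byteOffset);
    }
}
```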

initOnce

public static void initOnce()
                     throws java.io.IOException
Read the WordNet files only on initialization of the class.

Throws:
java.io.IOException
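
A load-once guard of the kind this method describes can be sketched as follows. The flag name mirrors the initNeeded field listed above, but the class InitOnceSketch, the loadFiles helper, the loadCount counter and the use of synchronized are all assumptions made for illustration.

```java
import java.io.IOException;

// Hypothetical illustration of a load-once guard; not the actual WordNet code.
class InitOnceSketch {

    // True until the data files have been read successfully.
    static boolean initNeeded = true;

    // Counts how many times the (stand-in) loader actually ran.
    static int loadCount = 0;

    // Read the data files on the first call only; later calls return immediately.
    static synchronized void initOnce() throws IOException {
        if (!initNeeded) {
            return;
        }
        loadFiles();
        // Clear the flag only after loading succeeded, so a failed load can be retried.
        initNeeded = false;
    }

    // Stand-in for reading the index, exception and mapping files.
    static void loadFiles() throws IOException {
        loadCount++;
    }
}
```

Callers can then invoke initOnce() before every lookup without paying the file-reading cost more than once.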

nounRootForm

public java.lang.String nounRootForm(java.lang.String mixedCase,
                                     java.lang.String input)
Return the root form of the noun, or null if it's not in the lexicon.

Parameters:
mixedCase - DOCUMENT ME!
input - DOCUMENT ME!
Returns:
The root form of the noun, or null if it is not in the lexicon.
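
Root-form lookup with an exception list (such as the "NOUN.EXC" file mentioned in the class description) typically checks irregular forms first and falls back to the lexicon. The sketch below is a simplified stand-in under that assumption: NounRootSketch is a hypothetical class, it takes a single argument because the roles of mixedCase and input are not documented here, and the plural rule (strip a trailing "s") is deliberately crude.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; not the actual nounRootForm implementation.
class NounRootSketch {

    // Irregular inflected form -> root form, as an exception list would provide.
    final Map<String, String> exceptions = new HashMap<>();

    // Known root forms, as an index file would provide.
    final Set<String> lexicon = new HashSet<>();

    // Return the root form of the noun, or null if it is not in the lexicon.
    String nounRootForm(String input) {
        String lower = input.toLowerCase(Locale.ROOT);
        String irregular = exceptions.get(lower);
        if (irregular != null) {
            return irregular;
        }
        if (lexicon.contains(lower)) {
            return lower;
        }
        // Crude regular-plural rule, assumed for illustration only.
        if (lower.endsWith("s")) {
            String stem = lower.substring(0, lower.length() - 1);
            if (lexicon.contains(stem)) {
                return stem;
            }
        }
        return null;
    }
}
```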

verbRootForm

public java.lang.String verbRootForm(java.lang.String mixedCase,
                                     java.lang.String input)
Return the present tense singular form of the verb, or null if it's not in the lexicon.

Parameters:
mixedCase - DOCUMENT ME!
input - DOCUMENT ME!
Returns:
The present tense singular form of the verb, or null if it is not in the lexicon.

getSUMOterm

public java.lang.String getSUMOterm(java.lang.String word,
                                    int pos)
Get the SUMO term for the given root form word and part of speech.

Parameters:
word - The word, in root form.
pos - The part of speech of the word.
Returns:
The SUMO term for the given word and part of speech.

containsWord

public boolean containsWord(java.lang.String word,
                            int pos)
Whether WordNet contains the given word.

Parameters:
word - The word to look up.
pos - The part of speech of the word.
Returns:
true if WordNet contains the given word, false otherwise.

page

public java.lang.String page(java.lang.String inp,
                             int pos,
                             java.lang.String sumokbname)
This is the regular point of entry for this class. It takes the word the user is searching for and the part-of-speech index, performs the search, and returns a string with HTML formatting codes to present to the user. The part-of-speech codes must be the same as in the menu options in WordNet.jsp and Browse.jsp.

Parameters:
inp - The string the user is searching for.
pos - The part of speech of the word: 1=noun, 2=verb, 3=adjective, 4=adverb.
sumokbname - The name of the SUMO knowledge base.
Returns:
A string containing the HTML-formatted search result.
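
The part-of-speech codes accepted by the pos parameter can be checked before a search is attempted. The following minimal sketch maps the four documented codes to names and rejects anything else; PosCodeSketch and posName are hypothetical, and whether the class's NOUN, VERB, ADJECTIVE and ADVERB constants carry these same values is an assumption.

```java
// Hypothetical illustration; not part of org.jscience.linguistics.kif.WordNet.
class PosCodeSketch {

    // Translate a documented part-of-speech code into a readable name.
    // Codes follow the pos parameter description: 1=noun, 2=verb, 3=adjective, 4=adverb.
    static String posName(int pos) {
        switch (pos) {
            case 1:
                return "noun";
            case 2:
                return "verb";
            case 3:
                return "adjective";
            case 4:
                return "adverb";
            default:
                throw new IllegalArgumentException("Unknown part-of-speech code: " + pos);
        }
    }
}
```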

main

public static void main(java.lang.String[] args)
A main method, used only for testing. It should not be called during normal operation.

Parameters:
args - The command-line arguments.