java.lang.Object
  |
  +--MainHw4
This is a skeleton of a main program for implementing a mini search engine on text files. The provided code takes care of the following: it uses WordGrabber to read words from the files one at a time.
For the Math 176 programming assignment you will need to implement the following functions (see the programming assignment web page for more information):
Constructor Summary
MainHw4()
Method Summary
static boolean | getCommand() | Reads in one of three commands.
static boolean | getTwoSearchWords() | Reads two search words.
static void | main(java.lang.String[] args) |
static void | prettyPrint(java.lang.String s) | Prints a long string wrapped to a width of 80 columns.
static void | printExtractWithTwoWords(int docNo, int posA, int posB) | Prints a two-line extract from a file, containing the words at the indicated positions.
static void | printFrequentWords() | Old code that I used to build the common-word file.
static void | readWordsFromFiles() | A demo routine that shows how to read words one at a time from the files.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Constructor Detail
public MainHw4()
Method Detail
public static void main(java.lang.String[] args) throws java.io.IOException
public static void readWordsFromFiles() throws java.io.IOException
See also printFrequentWords.
public static void printFrequentWords() throws java.io.IOException
The result was that 5,000,000 occurrences of words were reduced to 3,500,000+ occurrences of non-common words, a reduction of a little under one third in the words that need to be considered.
Note that I used a HashMap to keep a counter associated with each word. I had to store Integers as values, rather than ints, since ints are not Objects in Java.
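The counting approach described above can be sketched as follows. This is a minimal illustration rather than the original printFrequentWords code: the class name WordCountSketch and the helper countWords are hypothetical, and it uses generics and autoboxing from later Java versions, so the Integer wrapping that the note mentions happens implicitly.

```java
import java.util.HashMap;
import java.util.Map;

class WordCountSketch {
    // Hypothetical helper (not part of MainHw4): counts how often each word occurs.
    static Map<String, Integer> countWords(String[] words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            // Map values must be objects, so the counter is an Integer, not an int.
            // merge() stores 1 for a new word, otherwise adds 1 to the old count.
            counts.merge(w, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] words = {"the", "cat", "and", "the", "hat"};
        Map<String, Integer> counts = countWords(words);
        System.out.println("the: " + counts.get("the")); // prints "the: 2"
        System.out.println("cat: " + counts.get("cat")); // prints "cat: 1"
    }
}
```

Words whose counts exceed some threshold could then be written out as the common-word file; the threshold itself is not stated in this page.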
public static boolean getTwoSearchWords() throws java.io.IOException
Returns values in wordA and wordB, and in wordAwildCard and wordBwildCard.
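One way such a routine might fill the four fields is sketched below. The input convention is an assumption made only for illustration: this page does not say how a wildcard is written, so the sketch treats a trailing '*' as the wildcard marker. The class SearchWordsSketch and the method parseTwoSearchWords are hypothetical, and the method takes the input line as a parameter instead of reading it from standard input.

```java
class SearchWordsSketch {
    // Field names follow the description above; their types are assumed.
    static String wordA, wordB;
    static boolean wordAwildCard, wordBwildCard;

    // Hypothetical parser: returns true only if the line holds exactly two
    // non-empty search words. ASSUMPTION: a trailing '*' marks a wildcard.
    static boolean parseTwoSearchWords(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 2) {
            return false;
        }
        wordAwildCard = parts[0].endsWith("*");
        wordBwildCard = parts[1].endsWith("*");
        wordA = stripMarker(parts[0], wordAwildCard);
        wordB = stripMarker(parts[1], wordBwildCard);
        return !wordA.isEmpty() && !wordB.isEmpty();
    }

    // Removes the trailing wildcard marker when one is present.
    private static String stripMarker(String w, boolean hasMarker) {
        return hasMarker ? w.substring(0, w.length() - 1) : w;
    }

    public static void main(String[] args) {
        boolean ok = parseTwoSearchWords("heap* sort");
        System.out.println(ok + " " + wordA + " " + wordAwildCard
                + " " + wordB + " " + wordBwildCard);
    }
}
```

Returning a boolean and leaving the results in static fields mirrors the signature shown above; a version that returns a small result object would be easier to test, but would not match the skeleton.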
public static boolean getCommand() throws java.io.IOException
public static void prettyPrint(java.lang.String s)
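Printing a long string in an 80-column width amounts to greedy word wrapping. The sketch below shows one way to do it and is not the actual prettyPrint source; the names WrapSketch, wrap, and prettyPrintSketch are hypothetical, and it assumes that lines break only at whitespace and that a single word longer than the width is left on a line of its own.

```java
import java.util.ArrayList;
import java.util.List;

class WrapSketch {
    // Hypothetical helper: splits s into lines of at most `width` characters,
    // breaking only at whitespace. An over-long single word is not split.
    static List<String> wrap(String s, int width) {
        List<String> lines = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        for (String word : s.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // happens only when the input is blank
            }
            // Start a new line if adding " word" would exceed the width.
            if (line.length() > 0 && line.length() + 1 + word.length() > width) {
                lines.add(line.toString());
                line.setLength(0);
            }
            if (line.length() > 0) {
                line.append(' ');
            }
            line.append(word);
        }
        if (line.length() > 0) {
            lines.add(line.toString());
        }
        return lines;
    }

    // Prints s wrapped to 80 columns, as the method summary describes.
    static void prettyPrintSketch(String s) {
        for (String l : wrap(s, 80)) {
            System.out.println(l);
        }
    }

    public static void main(String[] args) {
        System.out.println(wrap("aaa bbb ccc", 7)); // prints "[aaa bbb, ccc]"
    }
}
```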
public static void printExtractWithTwoWords(int docNo, int posA, int posB)
Prints a two-line extract from a file, containing the words at the indicated positions. See prettyPrint. The extract will fit into two or three 80-column printed lines.
Parameters:
docNo - The document or file number in which the pair of words appears.
posA - The position of word A in the file (measured in bytes).
posB - The position of word B in the file (measured in bytes).
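Choosing the extract from two byte positions can be sketched as picking a bounded window around both of them. This is an illustration under stated assumptions, not the original implementation: the file contents are assumed to be already in memory as a byte array, the names ExtractSketch and extract are hypothetical, and the maxLen budget (for example 160 bytes for two 80-column lines) is a guess at how the size limit might be enforced.

```java
import java.nio.charset.StandardCharsets;

class ExtractSketch {
    // Hypothetical helper: returns up to maxLen bytes of text around posA and posB.
    // If the two positions are further apart than maxLen, the window starts at the
    // earlier one and the later word may be cut off.
    static String extract(byte[] text, int posA, int posB, int maxLen) {
        // Clamp both positions into the valid range of the array.
        int lo = Math.max(0, Math.min(Math.min(posA, posB), text.length));
        int hi = Math.max(0, Math.min(Math.max(posA, posB), text.length));
        // Split the unused budget evenly between leading and trailing context.
        int pad = Math.max(0, (maxLen - (hi - lo)) / 2);
        int start = Math.max(0, lo - pad);
        int end = Math.min(text.length, start + maxLen);
        // ISO-8859-1 maps each byte to one char, so byte offsets stay valid.
        String s = new String(text, start, end - start, StandardCharsets.ISO_8859_1);
        // Collapse line breaks and runs of spaces so the extract prints cleanly.
        return s.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        byte[] text = "one two three four five".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(extract(text, 4, 14, 160)); // whole text fits in the budget
        System.out.println(extract(text, 4, 14, 14));  // narrower window around both positions
    }
}
```

The returned string could then be handed to an 80-column printing routine such as prettyPrint.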