Is Lucene applicable to text analysis of IRC chats? How?
0 posts in topic

Posted By:   plumbird_MY
Posted On:   Tuesday, January 18, 2005 06:21 AM



I'm using Lucene 1.4.2 to build my system. However, I am not building a search engine like the Lucene web demo, where search results are displayed based on a query from the user.
What I want to build is a text analyzer that retrieves chat discussions from online Internet Relay Chat (IRC) channels and analyzes them to find the discussion topic of each chatroom.


First, I preprocess the text with stop-word removal and stemming, and this is done using Lucene 1.4.2. I have managed to get the stemmed words from the chatrooms and store them in a database. My problem is the steps after stopping and stemming: I don't know how to write the code that adds a document
for every chatroom and builds the document-term frequency matrix from them, and that then calculates the term weights and inverse document frequency (idf) and presents the document-term weights as a matrix.
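
For reference, the matrix described above can be sketched in plain Java without any Lucene calls. This is only an illustration under stated assumptions, not Lucene's API or its scoring formula: each chatroom is treated as one document given as a list of already stemmed tokens, the weight is raw term count times idf, and idf is taken as ln(N / df). The class name TfIdfSketch and all method names are hypothetical, and the code assumes a current Java version rather than the one in use with Lucene 1.4.2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene: builds a document-term weight matrix.
final class TfIdfSketch {

    /** Counts how often each stemmed term occurs in one document (one chatroom). */
    static Map<String, Integer> termFrequencies(List<String> stemmedTokens) {
        Map<String, Integer> tf = new TreeMap<>();
        for (String token : stemmedTokens) {
            tf.merge(token, 1, Integer::sum);
        }
        return tf;
    }

    /** Sorted list of all distinct terms; its order defines the matrix columns. */
    static List<String> vocabulary(List<Map<String, Integer>> docs) {
        TreeSet<String> terms = new TreeSet<>();
        for (Map<String, Integer> doc : docs) {
            terms.addAll(doc.keySet());
        }
        return new ArrayList<>(terms);
    }

    /** Assumed idf variant: ln(N / df), where df is the number of documents containing the term. */
    static double idf(String term, List<Map<String, Integer>> docs) {
        int df = 0;
        for (Map<String, Integer> doc : docs) {
            if (doc.containsKey(term)) {
                df++;
            }
        }
        return df == 0 ? 0.0 : Math.log((double) docs.size() / df);
    }

    /** Rows are documents, columns are vocabulary terms, cells are tf * idf. */
    static double[][] weightMatrix(List<Map<String, Integer>> docs, List<String> vocab) {
        double[][] weights = new double[docs.size()][vocab.size()];
        for (int j = 0; j < vocab.size(); j++) {
            String term = vocab.get(j);
            double termIdf = idf(term, docs);
            for (int i = 0; i < docs.size(); i++) {
                weights[i][j] = docs.get(i).getOrDefault(term, 0) * termIdf;
            }
        }
        return weights;
    }

    public static void main(String[] args) {
        // Made-up sample tokens standing in for two stemmed chatroom logs.
        List<Map<String, Integer>> docs = new ArrayList<>();
        docs.add(termFrequencies(Arrays.asList("java", "lucen", "java")));
        docs.add(termFrequencies(Arrays.asList("lucen", "music")));

        List<String> vocab = vocabulary(docs);
        double[][] weights = weightMatrix(docs, vocab);

        System.out.println("terms: " + vocab);
        for (double[] row : weights) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

With this idf variant, a term that appears in every chatroom gets weight 0, so only terms that distinguish one chatroom from the others carry weight. The stemmed words already stored in the database could be fed into termFrequencies one chatroom at a time.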



My question is: can Lucene 1.4.2 do this? If yes, could anyone please give me some sample code? I have read through the org.apache.lucene.index package, and there are classes named something like TermFreqVector, but I don't really understand how these classes are implemented
or how to call them to suit my system. Which class or interface should I call first, and what are the steps after that? What does the code to add documents look like?



Thanks to anyone kind enough to help and reply.
