Posted By:
BG_San
Posted On:
Tuesday, May 18, 2004 01:23 PM
The search engine I am working on indexes sale item listings with fields such as title, description, item ID and seller user ID. Relevancy and word matching work well with the case-insensitive, stemming Analyzer shown below. However, I do not want to tokenize or stem the seller user ID field, which is an alphanumeric string with no spaces and needs exact matching.
I notice that Lucene's IndexWriter takes only one analyzer per index, so I cannot see how to index the seller user ID for exact matching. One suggestion I have heard is to create a second index without stemming and do a multi-index search whenever the query includes the seller ID, but I'm not sure how well a multi-index search would perform. What are some good solutions to this problem?
// Index setup: one analyzer is passed to the IndexWriter for the whole index.
Analyzer analyzer = new TextSearchAnalyzer();
IndexWriter writer = new IndexWriter(indexDir, analyzer, newIndex);

// Tokenizes, lower-cases, removes stop words and stems every analyzed field.
public class TextSearchAnalyzer extends Analyzer
{
    public final TokenStream tokenStream(final Reader reader)
    {
        TokenStream result = new StandardTokenizer(reader);
        result = new StandardFilter(result);
        result = new LowerCaseFilter(result);
        result = new StopFilter(result, stopTable); // stopTable is not shown here
        result = new PorterStemFilter(result);
        return result;
    }
}
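To make the goal concrete, here is a rough plain-Java sketch of the per-field behaviour I am after. It uses no Lucene classes at all, and the class name PerFieldSketch, the field name "sellerId" and the sample values are all made up for illustration. It skips stop words and stemming and only shows the split between analyzed fields and the exact-match field.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Plain-Java illustration only; no Lucene classes are used here.
class PerFieldSketch {
    // Hypothetical name of the field holding the seller user ID.
    static final String SELLER_FIELD = "sellerId";

    // Returns the terms that would be indexed for the given field.
    static List<String> tokens(String field, String text) {
        List<String> out = new ArrayList<>();
        if (SELLER_FIELD.equals(field)) {
            // Exact-match field: keep the whole value as a single term.
            out.add(text);
            return out;
        }
        // Other fields: lower-case and split on non-alphanumeric characters.
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("title", "Vintage Camera, boxed"));
        System.out.println(tokens(SELLER_FIELD, "BG_San42"));
    }
}
```

With this sketch, a seller ID such as "BG_San42" stays one term, while the same string in the title field would be lower-cased and split at the underscore, which is the mismatch I want to avoid for the seller ID.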