XML support

Posted By:   michal_goldenberg
Posted On:   Monday, May 30, 2005 10:30 AM

The documentation states that the Lucene search engine supports the XML format.
What exactly does that include? Does it know how to analyze XML (i.e. search by a given path and return a path using the XML's tags),
or does it simply treat an XML file like any other text document?

Re: XML support

Posted By:   Richard_Krenek  
Posted On:   Wednesday, June 1, 2005 08:02 PM

Lucene indexes text. That means anything you can pull text from, you can index in Lucene; you just have to extract the text first. You probably would not want to put the raw XML directly into the index, but rather parse it first. You can write the parsing code yourself or find some on the web. Here are two quick examples I found: http://www-106.ibm.com/developerworks/java/library/j-lucene/ and http://lucene.apache.org/java/docs/contributions.html. This is a common task, so there are probably plenty of other libraries people have already written to extract the text.
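
To make the "parse it first" step concrete, here is a minimal sketch that uses only the SAX parser shipped with the JDK. It is not taken from either link above; the class name XmlTextExtractor and the idea of keying the text by element path (e.g. "/doc/title") are assumptions made for illustration. It deliberately leaves out the Lucene side, since the Document/Field API differs between Lucene versions.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: pulls the text out of an XML string, keyed by element path.
public class XmlTextExtractor {

    /** Maps each element path (e.g. "/doc/title") to the text found directly inside it. */
    public static Map<String, String> extract(String xml) {
        final Map<String, StringBuilder> raw = new LinkedHashMap<>();
        final Deque<String> path = new ArrayDeque<>();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                // Build the path of the element we are entering from its parent's path.
                String parent = path.isEmpty() ? "" : path.peek();
                path.push(parent + "/" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // SAX may deliver one text node in several chunks, so append without trimming.
                if (!path.isEmpty()) {
                    raw.computeIfAbsent(path.peek(), k -> new StringBuilder()).append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                // Separate the text of repeated elements that share the same path.
                StringBuilder sb = raw.get(path.pop());
                if (sb != null) {
                    sb.append(' ');
                }
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }

        // Collapse whitespace and drop paths that only contained whitespace.
        Map<String, String> fields = new LinkedHashMap<>();
        for (Map.Entry<String, StringBuilder> e : raw.entrySet()) {
            String text = e.getValue().toString().replaceAll("\\s+", " ").trim();
            if (!text.isEmpty()) {
                fields.put(e.getKey(), text);
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String xml = "<doc><title>Hello</title><body>Some text</body></doc>";
        for (Map.Entry<String, String> e : extract(xml).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

Each entry in the returned map could then be added to a Lucene document as one field, with the path as the field name. That would let you restrict a search to a given tag, which is about as close as you get to "search by path" without a dedicated XML search layer. The exact Field constructor to use depends on your Lucene version, so check its Javadoc.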