Posted By:
Otis_Gospodnetic
Posted On:
Friday, April 15, 2005 09:09 PM
Alfred, if I understand your concern correctly, you first need to extract the text from your PDF, Word, and other rich-text documents, and then index that text with Lucene. Erik and I developed a small framework that does just that, and you can see some references to it
here.
The code that comes with the Lucene in Action book is free, so you can download it from
http://www.lucenebook.com/
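
To make the "parse first, then index" idea concrete, here is a minimal sketch of the parsing half only. It is not the framework from the book: the names TextExtractor, ExtractorRegistry, and ParseThenIndexSketch are hypothetical, made up for this illustration. It assumes you pick a parser by file extension. Only a plain-text extractor is registered, because PDF and Word need a third-party parsing library that is not shown here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical interface: turns one file into plain text. */
interface TextExtractor {
    String extract(Path file) throws IOException;
}

/** Hypothetical registry that picks an extractor by file extension. */
class ExtractorRegistry {
    private final Map<String, TextExtractor> byExtension = new HashMap<>();

    /** Registers an extractor for an extension such as "txt" (case-insensitive). */
    void register(String extension, TextExtractor extractor) {
        byExtension.put(extension.toLowerCase(Locale.ROOT), extractor);
    }

    /** Returns the lower-cased extension, or "" if the name has no dot. */
    static String extensionOf(Path file) {
        String name = file.getFileName().toString();
        int dot = name.lastIndexOf('.');
        return dot < 0 ? "" : name.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /** Step 1: extract plain text. Step 2 (not shown) is handing it to Lucene. */
    String extractText(Path file) throws IOException {
        TextExtractor extractor = byExtension.get(extensionOf(file));
        if (extractor == null) {
            throw new IllegalArgumentException("No extractor registered for: " + file);
        }
        return extractor.extract(file);
    }
}

class ParseThenIndexSketch {
    public static void main(String[] args) throws IOException {
        ExtractorRegistry registry = new ExtractorRegistry();
        // Plain text needs no parsing; a PDF or Word extractor would wrap a parser library.
        registry.register("txt", f -> new String(Files.readAllBytes(f), StandardCharsets.UTF_8));

        Path sample = Files.createTempFile("sample", ".txt");
        Files.write(sample, "Hello, Lucene".getBytes(StandardCharsets.UTF_8));

        // This string is what you would put into a Lucene field and add to the index.
        System.out.println(registry.extractText(sample));
        Files.delete(sample);
    }
}
```

The point of the design is that the indexing code never needs to know about file formats: it only ever sees the extracted string, so supporting a new format means registering one more extractor.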