Re: how to create the analyzer for multiple format files??
Wednesday, September 21, 2005 09:51 AM
You may want to check the book "Lucene in Action"; it has examples. You do not need a special Analyzer for each format. First extract the text from your files, then pass that text through an Analyzer of your choice. For PDFs, the PDFBox library is good. Try a Google search with the terms "lucene pdf text extract", which should get you started. Also take a look at Nutch.
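To make the two steps concrete, here is a rough sketch using only the JDK. Every name in it (MultiFormatPipeline, EXTRACTORS, extractText, analyze) is made up for illustration and is not Lucene or PDFBox API. The analyze method is only a stand-in for whatever Lucene Analyzer you pick, and a PDF entry would delegate to a library such as PDFBox instead of decoding bytes directly.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: none of these names come from Lucene or PDFBox.
final class MultiFormatPipeline {

    // Step 1: one text extractor per file extension.
    // A real "pdf" entry would call a PDF library here (assumption, not shown).
    static final Map<String, Function<byte[], String>> EXTRACTORS = Map.of(
        "txt", bytes -> new String(bytes, StandardCharsets.UTF_8),
        // Crude tag stripping, good enough for a demo only.
        "html", bytes -> new String(bytes, StandardCharsets.UTF_8).replaceAll("<[^>]*>", " ")
    );

    // Picks the extractor by file extension and returns plain text.
    static String extractText(String fileName, byte[] content) {
        int dot = fileName.lastIndexOf('.');
        String ext = dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        Function<byte[], String> extractor = EXTRACTORS.get(ext);
        if (extractor == null) {
            throw new IllegalArgumentException("No extractor registered for: " + fileName);
        }
        return extractor.apply(content);
    }

    // Step 2: a single format-independent analysis step.
    // Stand-in for a Lucene Analyzer: lowercases and splits on non-alphanumerics.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }
}
```

The point of the split is that only step 1 knows about file formats. Adding a new format means registering one more extractor, while the analysis step stays the same for every document.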