How to search PDF files in lucene....

Posted By:   sombir_kadian
Posted On:   Monday, July 18, 2005 05:31 AM

Please tell me where I can download the jar/class files required to make PDF files searchable.

If possible, please also send me a suitable working example.

Re: How to search PDF files in lucene....

Posted By:   Richard_Krenek  
Posted On:   Wednesday, July 20, 2005 12:52 PM

You will need the PDFBox jar (http://www.pdfbox.org/) along with Lucene. If you own the book "Lucene in Action", go to page 235; it walks you through, step by step and with code, how to make things work. Basically, you have to extract the text from the PDF and then index that text into Lucene. One thing to be careful of is that Lucene, by default, indexes only the first 10,000 terms of each field. If you need more, you have to set that explicitly by raising the maximum field length on the IndexWriter.
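
To show why the term limit matters, here is a rough sketch in plain Java. It is not Lucene or PDFBox code: `TermLimitDemo`, `addDocument`, `search`, and `sampleText` are made-up names, the index is a toy in-memory map, and the PDF text-extraction step is replaced by a generated string. It only models the behaviour described above, where terms past the limit never reach the index.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy in-memory index with a per-document term limit (hypothetical, not Lucene). */
class TermLimitDemo {
    // Maps each term to the ids of the documents that contain it.
    private final Map<String, List<Integer>> postings = new HashMap<>();
    private final int maxTerms;
    private int nextDocId = 0;

    TermLimitDemo(int maxTerms) {
        this.maxTerms = maxTerms;
    }

    /** Indexes at most maxTerms terms of the already-extracted text; returns the document id. */
    int addDocument(String extractedText) {
        int docId = nextDocId++;
        int count = 0;
        String[] tokens = extractedText.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
        for (String token : tokens) {
            if (token.isEmpty()) {
                continue; // split() can yield an empty leading token
            }
            if (count >= maxTerms) {
                break; // everything past the limit is silently dropped
            }
            count++;
            List<Integer> ids = postings.computeIfAbsent(token, k -> new ArrayList<>());
            if (ids.isEmpty() || ids.get(ids.size() - 1) != docId) {
                ids.add(docId); // record each document once per term
            }
        }
        return docId;
    }

    /** Returns the ids of documents whose indexed terms include the given term. */
    List<Integer> search(String term) {
        List<Integer> ids = postings.get(term.toLowerCase(Locale.ROOT));
        return ids == null ? Collections.<Integer>emptyList() : ids;
    }

    /** Stand-in for text extracted from a long PDF: fillerCount filler words, then "appendix". */
    static String sampleText(int fillerCount) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fillerCount; i++) {
            sb.append("filler ");
        }
        return sb.append("appendix").toString();
    }

    public static void main(String[] args) {
        String text = sampleText(10000);

        TermLimitDemo limited = new TermLimitDemo(10000);
        limited.addDocument(text);
        System.out.println("limit 10000, hits for 'appendix': " + limited.search("appendix").size());

        TermLimitDemo raised = new TermLimitDemo(Integer.MAX_VALUE);
        raised.addDocument(text);
        System.out.println("limit raised, hits for 'appendix': " + raised.search("appendix").size());
    }
}
```

With the limit at 10000, the word "appendix" is term number 10001 and cannot be found; with the limit raised, the same document matches. Long PDFs hit this easily, so it is worth checking the limit before indexing them.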