Posted By:
Benoit_Quintin
Posted On:
Tuesday, July 27, 2004 12:34 PM
Well, PPT is a binary format, while Lucene is really only equipped to deal with plain text. Maybe you could parse the PPT with Apache's POI project, get the raw text out into some XML format of your own, then feed that into Lucene as a document with separate fields (filename, text content, file size, etc.). Then you could use those indexed fields as search parameters. But searching directly THROUGH the PPT files themselves, I don't know how you can do that.
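To make the "different fields" idea a bit more concrete, here is a rough sketch in plain Java. It is not Lucene's API and it does not call POI. The class `PptIndexSketch`, its methods, and the field names are all made up for illustration, and the slide text is assumed to have been extracted already (for example by POI). It only shows the shape of the approach: one document per file, one small inverted index per field, and a search that names the field to look in.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy stand-in for a fielded full-text index. Hypothetical class, NOT Lucene's API.
class PptIndexSketch {
    // field name -> (term -> ids of documents containing that term in that field)
    private final Map<String, Map<String, Set<Integer>>> postings = new HashMap<>();
    // Stored field values, so a hit can be mapped back to its file.
    private final List<Map<String, String>> docs = new ArrayList<>();

    // Adds one document (a field -> value map) and returns its id.
    int add(Map<String, String> fields) {
        int id = docs.size();
        docs.add(new LinkedHashMap<>(fields));
        for (Map.Entry<String, String> e : fields.entrySet()) {
            Map<String, Set<Integer>> terms =
                    postings.computeIfAbsent(e.getKey(), k -> new HashMap<>());
            for (String term : tokenize(e.getValue())) {
                terms.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
            }
        }
        return id;
    }

    // Returns the "filename" field of every document whose given field contains the term.
    List<String> search(String field, String term) {
        Set<Integer> ids = postings
                .getOrDefault(field, Collections.emptyMap())
                .getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySet());
        List<String> hits = new ArrayList<>();
        for (int id : ids) {
            hits.add(docs.get(id).get("filename"));
        }
        return hits;
    }

    // Lowercases the text and splits it on anything that is not a letter or digit.
    private static List<String> tokenize(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        PptIndexSketch index = new PptIndexSketch();

        // Assumption: "text" holds slide text already extracted from the binary file.
        Map<String, String> intro = new LinkedHashMap<>();
        intro.put("filename", "intro.ppt");
        intro.put("text", "Indexing slides with Lucene");
        intro.put("size", "20480");
        index.add(intro);

        Map<String, String> budget = new LinkedHashMap<>();
        budget.put("filename", "budget.ppt");
        budget.put("text", "Quarterly budget figures");
        budget.put("size", "51200");
        index.add(budget);

        System.out.println(index.search("text", "lucene")); // prints [intro.ppt]
    }
}
```

In a real setup, POI would supply the value of the text field and Lucene would take the place of the toy index. The point stays the same: the search runs over the extracted text, never over the PPT bytes themselves.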