Posted By:
Alfred_Sniff
Posted On:
Tuesday, May 3, 2005 05:30 AM
Hi all,
I'm having some problems while indexing my documents. I have a Word file of 12.5 MB and 2200 pages, which Lucene indexes without any problem. On the other hand, I have a Word file of 25 MB and 1000 pages (lots of images), and this one fails with an OutOfMemoryError: Java heap space. Is there some other default limit on file size?
My index has max_field_length set to Integer.MaxValue, but I don't know whether that is the cause.
I also have a second problem: a small PDF that cannot be indexed. It is not encrypted, and when I copy and paste its text into a .txt file, that file indexes without any problem. Where could the problem be? Does it come from PDFBox?
Thanks all
Best Regards