Re: Indexing Encrypting PDF document
Posted By: Richard_Krenek
Posted On: Tuesday, April 26, 2005 06:08 AM
I cannot answer all of your questions, but yes, the PDF you pointed to is secured so that text cannot be copied from it. Adobe's encryption is very easy to get around, but a better solution is to work with the people who provided the PDFs and either get the passwords or have them give you unsecured copies. If the PDFs are secured and the provider will not give you unsecured PDFs or the passwords, you can assume they really do not want you pulling text from their PDFs.
As for determining which fonts are in a PDF, I do not know off the top of my head how to do that; you may want to post to a PDFBox forum about it. Nor do I know of a general analyzer that can help you. You may need to write some specific code to do that.
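As a rough starting point for that kind of specific code, here is a minimal sketch that scans the raw bytes of a PDF for an /Encrypt entry and for /BaseFont names. It is a heuristic, not a PDF parser and not PDFBox's API: the function name scan_pdf_bytes and the regular expression are assumptions made for illustration, and it will miss fonts whose dictionaries are stored inside compressed object streams. A real PDF library would be more reliable.

```python
import re

# Heuristic pattern for entries like "/BaseFont /ABCDEF+Helvetica-Bold".
# Assumption: the font dictionary is stored uncompressed in the file.
BASEFONT_RE = re.compile(rb"/BaseFont\s*/([^\s/<>\[\]()]+)")


def scan_pdf_bytes(data: bytes) -> dict:
    """Crude scan of raw PDF bytes (hypothetical helper, not a real parser).

    Returns whether an /Encrypt key appears anywhere in the bytes and the
    sorted set of /BaseFont names that could be found.
    """
    names = set()
    for match in BASEFONT_RE.finditer(data):
        name = match.group(1).decode("latin-1")
        # Subset fonts carry a six-letter tag, e.g. "ABCDEF+Helvetica"; drop it.
        if re.match(r"[A-Z]{6}\+", name):
            name = name[7:]
        names.add(name)
    return {"encrypted": b"/Encrypt" in data, "fonts": sorted(names)}
```

To try it on a file, read the file in binary mode and pass the bytes in, for example scan_pdf_bytes(open("sample.pdf", "rb").read()), where sample.pdf is a placeholder name.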