How can I serve a "Linearized" PDF file from a servlet?
Created May 4, 2012
To make a document appear to load more quickly, Acrobat Reader relies on two things. First, the PDF file must be in linearized format, which the author of the PDF should be able to produce. Second, since you are serving the file from a servlet, the servlet must support the HTTP protocol features that Acrobat depends on.
In short, when Acrobat receives the first part of the file, it can determine that the file is in linearized form. It then caches the tables that are present at the beginning of the file and closes the connection, so the entire document is not downloaded all at once. From then on, Acrobat fetches each page or object from the web server as it is needed. It does this by sending HTTP byte range headers in its requests, and your servlet needs to support these for it to work properly.
    GET /example.pdf HTTP/1.1
    Range: bytes=0-1023

Here, the client is asking for only the first 1K of the file.
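As a rough illustration of what the servlet has to do with such a header, here is a minimal sketch of parsing a single byte range against a known file length. The class name ByteRange and its parse method are made up for this example and are not part of the Servlet API. It assumes only one range per request and returns null for anything it cannot handle.

```java
// Hypothetical helper for illustration; not part of the Servlet API.
final class ByteRange {
    final long start; // first byte offset, inclusive
    final long end;   // last byte offset, inclusive

    ByteRange(long start, long end) {
        this.start = start;
        this.end = end;
    }

    long length() {
        return end - start + 1;
    }

    /**
     * Parses a Range header value such as "bytes=0-1023", "bytes=4000-"
     * or "bytes=-500" against a file of the given length.
     * Returns null if the header is missing, malformed, lists several
     * ranges, or cannot be satisfied by the file.
     */
    static ByteRange parse(String header, long fileLength) {
        if (header == null || !header.startsWith("bytes=") || fileLength <= 0) {
            return null;
        }
        String spec = header.substring("bytes=".length()).trim();
        if (spec.contains(",")) {
            return null; // multiple ranges are not handled in this sketch
        }
        int dash = spec.indexOf('-');
        if (dash < 0) {
            return null;
        }
        String first = spec.substring(0, dash).trim();
        String last = spec.substring(dash + 1).trim();
        try {
            if (first.isEmpty()) {
                // Suffix form: the final N bytes of the file.
                if (last.isEmpty()) {
                    return null;
                }
                long suffix = Long.parseLong(last);
                if (suffix <= 0) {
                    return null;
                }
                return new ByteRange(Math.max(0, fileLength - suffix), fileLength - 1);
            }
            long start = Long.parseLong(first);
            // An open-ended or too-large end is clamped to the last byte.
            long end = last.isEmpty()
                    ? fileLength - 1
                    : Math.min(Long.parseLong(last), fileLength - 1);
            if (start > end) {
                return null; // start lies beyond the file or after the end
            }
            return new ByteRange(start, end);
        } catch (NumberFormatException e) {
            return null;
        }
    }
}
```

In a servlet you would presumably pass in the value of request.getHeader("Range") together with the size of the PDF file, and treat a null result as "send the whole file."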
The key to making this work in HTTP/1.1 is the way the server responds to such requests. If the server does not understand the byte range headers, it ignores them, replies with the usual "200 OK", and sends the entire file. If it does understand them, it replies with "206 Partial Content" and sends only the requested portion of the file, along with a Content-Range header identifying which bytes are included. Checking the response code is how client programs can tell whether a server supports "resumable downloads." If you can program your servlet to support these requests for partial content, I think you'll get the result you want.
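To make the 200-versus-206 decision concrete, here is a small sketch that works out the status code and headers for a response. The class RangeResponsePlan is hypothetical and deliberately leaves out the servlet plumbing. It only understands a single "bytes=first-last" or "bytes=first-" range and falls back to a plain 200 for everything else. A fuller implementation would likely also answer ranges that lie outside the file with "416 Range Not Satisfiable".

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the Servlet API.
final class RangeResponsePlan {
    // Matches one range only, e.g. "bytes=0-1023" or "bytes=4000-".
    private static final Pattern SINGLE_RANGE = Pattern.compile("bytes=(\\d+)-(\\d*)");

    final int status;  // 200 for the whole file, 206 for a partial response
    final long start;  // first byte to send, inclusive
    final long end;    // last byte to send, inclusive
    final Map<String, String> headers = new LinkedHashMap<>();

    private RangeResponsePlan(int status, long start, long end, long fileLength) {
        this.status = status;
        this.start = start;
        this.end = end;
        headers.put("Accept-Ranges", "bytes");
        headers.put("Content-Length", Long.toString(end - start + 1));
        if (status == 206) {
            // Tells the client which bytes of the full file are in the body.
            headers.put("Content-Range", "bytes " + start + "-" + end + "/" + fileLength);
        }
    }

    static RangeResponsePlan forRequest(String rangeHeader, long fileLength) {
        if (rangeHeader != null) {
            Matcher m = SINGLE_RANGE.matcher(rangeHeader.trim());
            if (m.matches()) {
                try {
                    long start = Long.parseLong(m.group(1));
                    long end = m.group(2).isEmpty()
                            ? fileLength - 1
                            : Math.min(Long.parseLong(m.group(2)), fileLength - 1);
                    if (start <= end) {
                        return new RangeResponsePlan(206, start, end, fileLength);
                    }
                } catch (NumberFormatException e) {
                    // Number too large for a long: fall through to the full response.
                }
            }
        }
        // No usable Range header: behave like a server that ignores ranges.
        return new RangeResponsePlan(200, 0, fileLength - 1, fileLength);
    }
}
```

Inside doGet, the idea would be to call response.setStatus(plan.status), copy each entry of plan.headers with response.setHeader, and then write only bytes plan.start through plan.end of the file to the response output stream.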
[Hmm, sounds a little dubious... Has anyone tried this? -Alex]