How can I index JSP files?

Otis Gospodnetic

To index the content of JSPs as a user would see it in a Web browser, you need to write an application that acts as a Web client and mimics the browser's behaviour. Once you have such an application, point it at the desired JSP, retrieve the content that the JSP generates, parse it, and feed it to Lucene.
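As a rough illustration of the "Web client" step, here is a minimal sketch that requests a page over HTTP and returns the generated content as a string. It uses only the standard `java.net.HttpURLConnection` class. The class name `JspFetcher`, the timeout values, and the assumption that the response is UTF-8 are all hypothetical choices made for the example; a real crawler would also want to honour the charset in the `Content-Type` header, handle redirects and cookies, and so on.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class JspFetcher {

    /** Reads the whole stream into a String using the given charset. */
    static String readAll(InputStream in, Charset charset) throws IOException {
        return new String(in.readAllBytes(), charset);
    }

    /**
     * Fetches the content that the server generates for the given URL,
     * i.e. the rendered output of the JSP, not its source.
     * Assumption: the response body is UTF-8 encoded.
     */
    static String fetchRendered(String pageUrl) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(pageUrl).toURL().openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5_000);   // illustrative timeouts, in milliseconds
        conn.setReadTimeout(10_000);
        try {
            int status = conn.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status " + status + " for " + pageUrl);
            }
            try (InputStream in = conn.getInputStream()) {
                return readAll(in, StandardCharsets.UTF_8);
            }
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java JspFetcher <url-of-jsp>");
            return;
        }
        String content = fetchRendered(args[0]);
        System.out.println("Fetched " + content.length() + " characters");
    }
}
```

The string returned by `fetchRendered` is what you would then parse and pass on to Lucene; that part is left out here because it depends on the content type and on the Lucene version in use.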

How you parse the output of the JSP depends on the type of content that the JSP generates. In most cases the content will be HTML, so you should read the "How can I index HTML documents?" FAQ entry.
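For the common HTML case, the following is one possible sketch of pulling the visible text out of a page before handing it to Lucene. It relies on the HTML parser that ships with the JDK (`javax.swing.text.html.parser.ParserDelegator`); this is only one option, chosen here because it needs no extra libraries, and it is not necessarily the approach the HTML FAQ entry recommends. The class name `HtmlTextExtractor` is hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

public class HtmlTextExtractor {

    /** Returns the text content of an HTML document, with markup removed. */
    static String extractText(String html) throws IOException {
        StringBuilder text = new StringBuilder();

        // The parser calls handleText for each run of character data it finds.
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                if (text.length() > 0) {
                    text.append(' ');   // separate text runs with a space
                }
                text.append(data);
            }
        };

        try (Reader reader = new StringReader(html)) {
            // 'true' tells the parser to ignore charset declarations in the
            // document, since the input is already decoded into a String.
            new ParserDelegator().parse(reader, callback, true);
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        String html = "<html><body><h1>Hello</h1><p>World</p></body></html>";
        System.out.println(extractText(html));
    }
}
```

The extracted text can then be placed in a field of a Lucene document. A dedicated HTML parser will generally cope better with malformed real-world pages than this built-in one.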

Most importantly, do not try to index JSPs by treating them as ordinary files in your file system: that would index the JSP source (markup, scriptlets, and tags) rather than the content it generates. To index JSPs properly you need to access them via HTTP, acting like a Web client.