If my servlet or JSP contains international characters (like "Ýðüö"), how do I ensure they are displayed correctly?

Alex Chaffee

The simplest way is to write your JSP (or the strings in your servlet) using HTML numeric character references, which encode each character by its Unicode code point. For instance, the string "Ýðüö" would be represented as

&#221;&#240;&#252;&#246;

I have written a method, htmlescape(), that converts a Unicode string (one containing two-byte Unicode characters) into the HTML-escaped equivalent. It is available at http://www.purpletech.com/code/src/com/purpletech/util/Utils.java. (If the site is currently unavailable, please send me email.)
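
For readers who cannot reach that file, here is a minimal sketch of what such a conversion might look like. It is not the code from Utils.java: the class name HtmlEscapeSketch and the choice to also escape &, <, > and the double quote are assumptions made for illustration.

```java
// Hypothetical sketch; not the htmlescape() implementation from Utils.java.
final class HtmlEscapeSketch {
    private HtmlEscapeSketch() {}

    /**
     * Replaces every character above U+007E, plus the HTML-special
     * characters & < > ", with a decimal numeric character reference.
     */
    static String htmlEscape(String s) {
        StringBuilder out = new StringBuilder(s.length() * 2);
        for (int i = 0; i < s.length(); ) {
            // Read whole code points so surrogate pairs become one reference.
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            if (cp > 0x7E || cp == '&' || cp == '<' || cp == '>' || cp == '"') {
                out.append("&#").append(cp).append(';');
            } else {
                out.append((char) cp);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file itself pure ASCII.
        System.out.println(htmlEscape("\u00DD\u00F0\u00FC\u00F6"));
        // prints &#221;&#240;&#252;&#246;
    }
}
```

Because the output is pure ASCII, it survives whatever character encoding the response happens to use.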

Another option is to set the character encoding for the response and write the Unicode characters directly, without escaping them. From the Javadoc for ServletResponse:

The charset for the MIME body response can be specified with setContentType(java.lang.String). For example, "text/html; charset=Shift_JIS". The charset can alternately be set using setLocale(java.util.Locale). If no charset is specified, ISO-8859-1 will be used. The setContentType or setLocale method must be called before getWriter for the charset to affect the construction of the writer.
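
To see why the order of calls matters, the sketch below imitates what a container does when it builds the response writer. It uses only the standard library, so it runs without a servlet container. The class CharsetWriterSketch and its encode() helper are hypothetical names; in a real servlet the equivalent step would be calling response.setContentType("text/html; charset=UTF-8") before response.getWriter().

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.Charset;

// Hypothetical stand-in for the writer a servlet container constructs.
final class CharsetWriterSketch {
    private CharsetWriterSketch() {}

    /** Encodes text as a writer fixed to the named charset would. */
    static byte[] encode(String text, String charsetName) {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        // The charset is chosen when the writer is created and cannot
        // be changed afterwards, just like the writer from getWriter().
        PrintWriter writer = new PrintWriter(
                new OutputStreamWriter(body, Charset.forName(charsetName)));
        writer.print(text);
        writer.flush();
        return body.toByteArray();
    }

    public static void main(String[] args) {
        String latin = "\u00DD\u00F0\u00FC\u00F6";   // the four letters from the question
        String japanese = "\u65E5\u672C\u8A9E";      // three characters outside ISO-8859-1

        System.out.println(encode(latin, "ISO-8859-1").length);  // prints 4
        System.out.println(encode(latin, "UTF-8").length);       // prints 8

        // Characters the charset cannot represent are replaced with '?'.
        byte[] lossy = encode(japanese, "ISO-8859-1");
        System.out.println(new String(lossy, Charset.forName("ISO-8859-1"))); // prints ???
        System.out.println(encode(japanese, "UTF-8").length);    // prints 9
    }
}
```

The third line of output shows the typical symptom of leaving the default ISO-8859-1 in place: text outside that charset reaches the browser as question marks.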

See the Internet RFCs such as RFC 2045 for more information on MIME. Protocols such as SMTP and HTTP define profiles of MIME, and those standards are still evolving.

In addition, the international version of the JDK/JRE is required to have all the encodings available.
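
If you are unsure whether a given encoding is installed, you can ask the JVM before relying on it. This is a small sketch using java.nio.charset.Charset; the class name EncodingCheckSketch and the isAvailable() helper are illustrative, not part of any library.

```java
import java.nio.charset.Charset;

// Hypothetical helper for checking that an encoding is installed.
final class EncodingCheckSketch {
    private EncodingCheckSketch() {}

    /** Returns true if this JVM supports the named charset. */
    static boolean isAvailable(String charsetName) {
        try {
            return Charset.isSupported(charsetName);
        } catch (IllegalArgumentException e) {
            // Thrown for a null or syntactically illegal charset name.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("UTF-8: " + isAvailable("UTF-8"));
        System.out.println("Shift_JIS: " + isAvailable("Shift_JIS"));
    }
}
```

UTF-8 and ISO-8859-1 are guaranteed on every Java platform; whether Shift_JIS is reported as available depends on the installed runtime.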
