If my servlet or JSP contains international characters (like "Ýðüö"), how do I ensure they are displayed correctly?
I have written a method, htmlescape(), that converts a Unicode string (one containing two-byte Unicode characters) into the HTML-escaped equivalent. It is available at http://www.purpletech.com/code/src/com/purpletech/util/Utils.java. (If the site is currently unavailable, please send me email.)
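For readers who cannot reach that file, here is a minimal sketch of the general idea. It is not the htmlescape() method from Utils.java; the class name HtmlEscapeSketch and the method name escapeNonAscii are hypothetical, and the choice to emit decimal numeric character references for everything outside ASCII is an assumption made for illustration.

```java
public final class HtmlEscapeSketch {
    private HtmlEscapeSketch() {}

    /**
     * Hypothetical helper (not the author's htmlescape()): escapes HTML
     * markup characters and replaces every non-ASCII code point with a
     * decimal numeric character reference such as &#252;.
     */
    public static String escapeNonAscii(String s) {
        StringBuilder out = new StringBuilder(s.length() * 2);
        for (int i = 0; i < s.length(); ) {
            // Read whole code points so surrogate pairs become one reference.
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            switch (cp) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:
                    if (cp < 128) {
                        out.append((char) cp);          // plain ASCII passes through
                    } else {
                        out.append("&#").append(cp).append(';');
                    }
            }
        }
        return out.toString();
    }
}
```

Because the result is pure ASCII, it displays the same way whatever charset the response declares. A servlet could, for instance, pass each dynamic string through HtmlEscapeSketch.escapeNonAscii before printing it.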
Another option is to set the character encoding for the response and write the Unicode text directly, letting the response writer encode it. Note that the international version of the JDK/JRE is required to have all the encodings available. From the Javadoc for ServletResponse:
The charset for the MIME body response can be specified with setContentType(java.lang.String). For example, "text/html; charset=Shift_JIS". The charset can alternately be set using setLocale(java.util.Locale). If no charset is specified, ISO-8859-1 will be used. The setLocale method must be called before getWriter for the charset to affect the construction of the writer.
See the Internet RFCs such as RFC 2045 for more information on MIME. Protocols such as SMTP and HTTP define profiles of MIME, and those standards are still evolving.
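In a servlet, this amounts to calling something like response.setContentType("text/html; charset=UTF-8") before response.getWriter(). To show why the order and the charset matter, here is a small sketch that uses only the standard library and needs no servlet container. It encodes text through a writer built for a given charset, which is roughly what the container does when it constructs the response writer. The class name CharsetWriterSketch and the method name encode are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;

public final class CharsetWriterSketch {
    private CharsetWriterSketch() {}

    /**
     * Hypothetical helper: returns the bytes produced when the text is
     * written through a writer constructed for the given charset.
     * Characters the charset cannot represent are replaced (with '?'
     * for ISO-8859-1), which is how text gets corrupted when the
     * default charset is left in place.
     */
    public static byte[] encode(String text, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, charset)) {
            writer.write(text);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

With this sketch, the four characters from the question encode to four bytes in ISO-8859-1 and eight bytes in UTF-8, while a character outside ISO-8859-1 (a Japanese character, say) comes out as a single '?' byte under the default. The charset is fixed when the writer is constructed, so it has to be chosen beforehand.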
- What is the difference between URL encoding, URL rewriting, and HTML escaping?
- character encoding
- Supported Encodings
- How to write a servlet which can transfer data written in ISO8859-1 into GB2312 or another kind of character encoding?