If my servlet or JSP contains international characters (like "Ýðüö"), how do I ensure they are displayed correctly?
Created May 4, 2012
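One option is to write the characters as HTML character escapes, either as numeric references (&#221;&#240;&#252;&#246;) or as named entities (&Yacute;&eth;&uuml;&ouml;).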
I have written a method, htmlescape(), that converts a Unicode string (one that may contain characters outside the ASCII range) into its HTML-escaped equivalent. It is available at http://www.purpletech.com/code/src/com/purpletech/util/Utils.java. (If the site is currently unavailable, please send me email.)
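For readers who cannot reach that file, here is a minimal sketch of what such an escaping method might look like. It is not the htmlescape() from Utils.java: the class name HtmlEscapeSketch and the method htmlEscape are hypothetical, and the choice to emit decimal numeric references for every non-ASCII code point is an assumption made for illustration.

```java
// Hypothetical sketch; NOT the author's htmlescape() from Utils.java.
final class HtmlEscapeSketch {

    /** Escapes HTML markup characters and replaces every non-ASCII code point
     *  with a decimal numeric character reference such as &#221;. */
    static String htmlEscape(String text) {
        StringBuilder out = new StringBuilder(text.length() * 2);
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);      // reads a surrogate pair as one code point
            i += Character.charCount(cp);      // advance by 1 or 2 chars accordingly
            switch (cp) {
                case '&': out.append("&amp;");  break;
                case '<': out.append("&lt;");   break;
                case '>': out.append("&gt;");   break;
                case '"': out.append("&quot;"); break;
                default:
                    if (cp < 0x80) {
                        out.append((char) cp); // plain ASCII passes through unchanged
                    } else {
                        out.append("&#").append(cp).append(';');
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file ASCII-only; the string is "Ýðüö".
        System.out.println(htmlEscape("\u00DD\u00F0\u00FC\u00F6"));
    }
}
```

Because the escaped output is pure ASCII, it displays the same way whatever charset the response declares, at the cost of a larger page.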
Another option is to set the character encoding for the response and output binary (Unicode) data directly. From the Javadoc for ServletResponse:
The charset for the MIME body response can be specified with setContentType(java.lang.String). For example, "text/html; charset=Shift_JIS". The charset can alternately be set using setLocale(java.util.Locale). If no charset is specified, ISO-8859-1 will be used. The setContentType or setLocale method must be called before getWriter for the charset to affect the construction of the writer.
See the Internet RFCs such as RFC 2045 for more information on MIME. Protocols such as SMTP and HTTP define profiles of MIME, and those standards are still evolving.
In addition, the international version of the JDK/JRE is required to have all the encodings available.
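To see why the charset has to be fixed before the writer is created, the sketch below uses only JDK classes to stand in for the container. In a real servlet you would call response.setContentType("text/html; charset=UTF-8") before response.getWriter(). Here an OutputStreamWriter plays the role of that writer. The class CharsetWriterSketch and its methods are hypothetical names introduced for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;

// Hypothetical stand-in for what a servlet container's writer does with the charset.
final class CharsetWriterSketch {

    /** Writes text through a Writer bound to the named charset and returns the raw bytes. */
    static byte[] encode(String text, String charsetName) {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        // The charset is chosen when the writer is constructed, which is why
        // setContentType/setLocale must come before getWriter in a servlet.
        try (Writer writer = new OutputStreamWriter(body, Charset.forName(charsetName))) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return body.toByteArray();
    }

    /** Formats bytes as space-separated upper-case hex pairs. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "\u00DD\u00F0\u00FC\u00F6"; // "Ýðüö", written with escapes to stay ASCII-only
        System.out.println(hex(encode(text, "ISO-8859-1"))); // one byte per character
        System.out.println(hex(encode(text, "UTF-8")));      // two bytes per character
        System.out.println(hex(encode(text, "US-ASCII")));   // unmappable characters become '?'
    }
}
```

The same four characters produce different byte sequences under each charset, so the browser renders them correctly only if the charset declared in the Content-Type header matches the one the writer actually used.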
See also
- http://hotwired.lycos.com/webmonkey/reference/special_characters/
- What is the difference between URL encoding, URL rewriting, and HTML escaping?
- character encoding
- Supported Encodings
- How to write a servlet which can transfer data written in ISO8859-1 into GB2312 or another kind of character encoding?