Posted By: Mark_Rose
Posted On: Tuesday, August 14, 2001 05:10 PM
Both browsers use whatever encoding the page was sent in
(specified in either the Content-Type HTTP header or the
HTML <meta> element).
The characters you are receiving are definitely not UTF-8
encodings of Unicode values. They are probably the Windows-1252
encodings of the "box drawing characters," but I'm not sure.
UTF-8 encodings of Unicode values larger than 0x7F always start
with a first byte of 0xC0 or higher (bytes 0x80 through 0xBF
appear only as continuation bytes after that first byte).
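To make the first-byte rule concrete, here is a minimal sketch that prints the first UTF-8 byte of a few sample characters. The class and method names are made up for illustration, and it assumes a JDK recent enough to have StandardCharsets (Java 7 or later).

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; the class and method names are hypothetical.
class Utf8LeadByteDemo {

    // Returns the first byte of the UTF-8 encoding of s as an unsigned value (0..255).
    static int firstUtf8Byte(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return bytes[0] & 0xFF;
    }

    // True if b falls in the range described above for the first byte
    // of a multi-byte UTF-8 sequence (0xC0 or higher).
    static boolean isMultiByteLead(int b) {
        return b >= 0xC0;
    }

    public static void main(String[] args) {
        // "A" is plain ASCII; the others are all above 0x7F.
        String[] samples = { "A", "\u00E9", "\u20AC", "\u2500" };
        for (String s : samples) {
            int first = firstUtf8Byte(s);
            System.out.printf("U+%04X -> first byte 0x%02X, multi-byte lead: %b%n",
                    (int) s.charAt(0), first, isMultiByteLead(first));
        }
    }
}
```

If the bytes arriving at your servlet for a non-ASCII character do not begin with a byte in that range, they were not sent as UTF-8.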
Suggestions:
- specify the encoding of the page using the Content-Type header or <meta> techniques;
- include on the form a hidden field that contains the encoding (called "charset" below);
- use a modified version of your technique to get the values:
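// The container decoded the parameter bytes as ISO-8859-1; recover those
// bytes, then decode them with the charset named in the hidden field.
String converted =
    new String(
        request.getParameter("field").getBytes("ISO-8859-1"),
        request.getParameter("charset")
    );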
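If it helps to see why the round trip through ISO-8859-1 is safe, the sketch below simulates it without a servlet container. The class and method names are hypothetical, the "container" is faked by decoding the raw bytes as ISO-8859-1, and it assumes the browser sent the field as UTF-8.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; the class and method names are hypothetical.
class ParameterRecodeDemo {

    // Stands in for a container that decodes raw request bytes as ISO-8859-1.
    static String decodeAsLatin1(byte[] rawBytes) {
        return new String(rawBytes, StandardCharsets.ISO_8859_1);
    }

    // Same idea as the snippet above: ISO-8859-1 maps every byte to exactly
    // one char, so getBytes recovers the original bytes unchanged, and they
    // can then be decoded with the charset the page was really sent in.
    static String recode(String misdecoded, String charsetName) {
        byte[] original = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(original, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        String typed = "caf\u00E9 \u20AC5";                  // what the user entered
        byte[] sent = typed.getBytes(StandardCharsets.UTF_8); // assumed browser encoding
        String seen = decodeAsLatin1(sent);                   // what getParameter would return
        String fixed = recode(seen, "UTF-8");                 // value of the hidden "charset" field

        System.out.println("mis-decoded matches input: " + typed.equals(seen));
        System.out.println("recoded matches input:     " + typed.equals(fixed));
    }
}
```

This only works because the intermediate charset is ISO-8859-1. Using a charset with unmapped bytes for the intermediate step would lose data.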
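Note: The Servlet 2.3 spec makes this easier (and different) by adding ServletRequest.setCharacterEncoding(), but few containers support it yet.
The code above will still work in a Servlet 2.3 container, provided the container is still decoding the parameters as ISO-8859-1.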