Unicode string problem.

Posted By:   Eric_Chow
Posted On:   Wednesday, August 28, 2002 07:59 PM


			
Hello,

If a file contains Unicode escape sequences (\u4e08 \u4e11),

how can I read the file and display the actual characters that those escapes (\u4e08 \u4e11) represent?

Best regards,
Eric



Re: Unicode string problem.

Posted By:   eimi_nos  
Posted On:   Friday, October 11, 2002 03:42 PM

Try the "com.ibm.icu.text" package in ICU4J, which can be downloaded here:

http://oss.software.ibm.com/icu4j

Re: Unicode string problem.

Posted By:   Jorgen_Nordqvist  
Posted On:   Thursday, August 29, 2002 03:50 PM

The easiest thing is probably to view the characters in a Web browser. Write a small servlet that sets the charset in the Content-Type HTTP header to "UTF-8" and then writes out the Unicode characters/string. Make sure your browser's encoding setting is also set to UTF-8, and you will be able to view the correct characters.
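
The encoding step of that suggestion can be tried without a servlet container. This is only a minimal sketch, assuming plain java.io on a recent JDK rather than the servlet API; the class name Utf8PageWriter and the method writePage are hypothetical. In a real servlet the corresponding step would be calling response.setContentType("text/html; charset=UTF-8") before obtaining the writer.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any library.
final class Utf8PageWriter {
    /** Writes a tiny HTML page containing the text, encoded as UTF-8. */
    static void writePage(OutputStream out, String text) throws IOException {
        // The writer converts Java chars to UTF-8 bytes on the way out.
        Writer w = new OutputStreamWriter(out, StandardCharsets.UTF_8);
        // The meta tag declares the charset when no HTTP header is present.
        w.write("<html><head><meta http-equiv=\"Content-Type\" "
                + "content=\"text/html; charset=UTF-8\"></head><body>");
        // The text is written as-is; HTML-escaping is omitted for brevity.
        w.write(text);
        w.write("</body></html>");
        w.flush();
    }
}
```

Writing the result to a file (for example through a FileOutputStream) and opening it in a browser should show the characters, provided an installed font covers them.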

Jorgen

Re: Unicode string problem.

Posted By:   Rob_Eamon  
Posted On:   Wednesday, August 28, 2002 08:25 PM

Here's a function that converts Unicode escape literals, and other escaped characters, to their char equivalents.

/**
 * Converts Java and Unicode character escape literals to
 * their character equivalent.
 *
 * Various services may require the use of special
 * characters that cannot normally be typed at the
 * keyboard, such as a newline or carriage return. These
 * characters can be supported through the use of escape
 * sequences. These escape sequences are interpreted and
 * converted to their character equivalent.
 *
 * The following escaped characters can be specified as
 * the input variable and will be converted as described:
 *
 *   \b - Backspace
 *   \t - Horizontal tab
 *   \n - Newline
 *   \f - Form feed
 *   \r - Carriage return
 *   &#92;uXXXX - The Unicode character with encoding XXXX,
 *       where XXXX is four hexadecimal digits. For example,
 *       &#92;u000a represents a newline character.
 *       (&#92; stands for a backslash; a literal backslash
 *       followed by u is parsed as a Unicode escape by the
 *       Java compiler even inside a comment.)
 *
 * If the input variable unitCode contains any other
 * values, the first character is returned. If unitCode is
 * null or empty, the returned character is 0 (nul - not
 * the character '0').
 */
public static char toChar(String unitCode)
{
    char theChar = 0;
    if(unitCode != null)
    {
        unitCode = unitCode.trim();
        if(unitCode.length() > 0)
        {
            theChar = unitCode.charAt(0); // default to return 1st character
            if(unitCode.charAt(0) == '\\')
            {
                if(unitCode.length() == 1)
                {
                    theChar = '\\';
                }
                else if(unitCode.length() == 2)
                {
                    switch(unitCode.charAt(1))
                    {
                        case 'r':
                            theChar = '\r';
                            break;
                        case 'n':
                            theChar = '\n';
                            break;
                        case 'b':
                            theChar = '\b';
                            break;
                        case 't':
                            theChar = '\t';
                            break;
                        case 'f':
                            theChar = '\f';
                            break;
                    }
                }
                else if((unitCode.charAt(1) == 'u') && (unitCode.length() >= 6))
                {
                    // parse the four hex digits that follow the backslash and 'u'
                    theChar = (char) Integer.parseInt(unitCode.substring(2, 6), 16);
                }
            }
        }
    }
    return theChar;
}
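
The function above converts one escape at a time, so reading a whole file needs one more step. Here is a minimal sketch of that step, assuming a recent JDK and a file that is ASCII-compatible text containing literal backslash-u escapes; the class name UnicodeEscapeDecoder and both method names are hypothetical, and it uses a regular expression rather than the function above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class UnicodeEscapeDecoder {
    // Matches a literal backslash, the letter u, and exactly four hex digits.
    private static final Pattern ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    /** Replaces every backslash-u escape in the text with the character it names. */
    static String decode(String text) {
        Matcher m = ESCAPE.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            char c = (char) Integer.parseInt(m.group(1), 16);
            // quoteReplacement keeps '$' and backslashes from being treated specially.
            m.appendReplacement(out, Matcher.quoteReplacement(String.valueOf(c)));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Reads the whole file (assumed ASCII-compatible) and decodes its escapes. */
    static String decodeFile(String path) throws IOException {
        byte[] bytes = Files.readAllBytes(Paths.get(path));
        return decode(new String(bytes, StandardCharsets.UTF_8));
    }
}
```

Text that does not form a complete escape (fewer than four hex digits after the u) is left unchanged. To actually see the decoded characters, the output side (console, GUI font, or web page) still has to support them.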

HTH