Use ANTLR to process Chinese Characters

Posted By:   Vivian_Fonger
Posted On:   Wednesday, February 6, 2002 02:38 PM

I am having a problem using ANTLR to parse Traditional Chinese characters.

The Chinese characters are encoded in UTF-8.

When I call the .getText() method of the antlr.Token class on the next token, it returns only the high byte of my Chinese character; I get the low byte when I call it again on the following token.

I would like to know how to use the .getText() method to read each UTF-8 Chinese character as a whole.

Thanks.

Re: Use ANTLR to process Chinese Characters

Posted By:   Terence_Parr  
Posted On:   Monday, February 11, 2002 05:41 PM

That sounds like your input stream is not decoding the UTF-8. Are you using DataInputStream or a Reader?
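
To illustrate the difference, here is a minimal sketch using only the standard Java library. It is not taken from the original poster's code; the class name Utf8ReaderDemo and its two helper methods are made up for the example. Reading the raw stream yields one value per byte, while wrapping the same stream in an InputStreamReader with an explicit UTF-8 charset yields one char per Chinese character. StandardCharsets requires Java 7 or later; on older JDKs the charset name "UTF-8" can be passed as a string instead.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

public class Utf8ReaderDemo {
    // Reads the stream byte by byte, with no decoding, and counts the reads.
    public static int countBytes(InputStream in) throws IOException {
        int n = 0;
        while (in.read() != -1) {
            n++;
        }
        return n;
    }

    // Decodes the stream as UTF-8 through a Reader and returns the text.
    public static String readAll(InputStream in) throws IOException {
        Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = reader.read()) != -1) {
            sb.append((char) c);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Two Chinese characters (U+4E2D, U+6587), three bytes each in UTF-8.
        byte[] utf8 = "\u4e2d\u6587".getBytes(StandardCharsets.UTF_8);

        // Raw stream: six separate byte values.
        System.out.println(countBytes(new ByteArrayInputStream(utf8)));

        // Decoding Reader: two chars, one per character.
        String text = readAll(new ByteArrayInputStream(utf8));
        System.out.println(text.length());
    }
}
```

If the lexer is fed from a Reader like the one built in readAll, rather than from a raw InputStream, it should see whole characters. Whether the grammar's character vocabulary also needs to be widened to cover the Chinese range is a separate question that depends on the lexer options in use.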