Where can I find information on UTF and conversions from one character set to another?
Created Aug 26, 2001
Joe Sam Shirah
UTF stands for Unicode Transformation Format, which "is an algorithmic mapping from every Unicode scalar value to a unique byte sequence." UTF-8 is probably the most commonly used of these formats. For more information, see:
- UTF & BOM
- RFC 2279: UTF-8, a transformation format of ISO 10646 (since obsoleted by RFC 3629)
- Unicode Transformation Formats: UTF-8 & Co.
- UTF-8 and Unicode Standards
- UTF-8 and Unicode FAQ for Unix/Linux
- How do I convert a String from Unicode to another encoding and vice versa?
- If an HTML form encoded in UTF-8 is presented...
- How can I store and retrieve Unicode or Double Byte data in a file using the java.io libraries?
- How do I set a default character encoding for file I/O operations, JDBC requests and so on?
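As a rough illustration of the conversions these links discuss, here is a minimal sketch in Java. It assumes Java 7 or later (for `java.nio.charset.StandardCharsets`); the class name `EncodingDemo` and the helper methods `toUtf8` and `transcode` are made up for this example and are not part of any library. It encodes a String as UTF-8 bytes, then re-encodes those bytes as ISO-8859-1 by decoding to a String first.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {

    // Hypothetical helper: encode a String as UTF-8 bytes.
    static byte[] toUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Hypothetical helper: convert bytes from one charset to another.
    // The bytes are decoded to a String using the source charset,
    // then encoded again using the target charset.
    static byte[] transcode(byte[] input, Charset from, Charset to) {
        return new String(input, from).getBytes(to);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "café"; U+00E9 is outside ASCII

        byte[] utf8 = toUtf8(text);
        byte[] latin1 = transcode(utf8,
                StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);

        // U+00E9 takes two bytes in UTF-8 and one byte in ISO-8859-1.
        System.out.println("UTF-8 bytes:      " + utf8.length);
        System.out.println("ISO-8859-1 bytes: " + latin1.length);

        // Decoding with the matching charset recovers the original text.
        String roundTrip = new String(latin1, StandardCharsets.ISO_8859_1);
        System.out.println("Round trip ok:    " + roundTrip.equals(text));
    }
}
```

Naming the charset explicitly on every `getBytes` and `new String` call avoids depending on the platform default encoding. Characters that the target charset cannot represent are replaced during encoding (typically with `?`), so a conversion to a smaller character set can lose data.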