Java: Unicode. Can anyone clear up my doubts regarding Unicode and bytecode? I am in urgent need of help.

Posted By:   S_Shreya
Posted On:   Tuesday, December 25, 2001 01:36 AM




Hello there,
I am confused about Unicode and bytecode in Java.
I don't understand why, when data types are described, only
the char data type is described as 16-bit Unicode. What about int,
long, byte, float, etc.? Is there any relation between Unicode and bytecode?
Also, bytecode is an intermediate code: how is it stored, i.e., in
what format?



Why is Java both compiled and interpreted? Since interpretation is
slower than compilation, why can't the bytecode be compiled again instead
of being interpreted? How do just-in-time compilers work, and why are they
not so popular? How is ASCII a subset of Unicode? Which code do the Java
books available in the market use: ASCII or Unicode?
Thank you!
Girija


Re: Java: Unicode. Can anyone clear up my doubts regarding Unicode and bytecode? I am in urgent need of help.

Posted By:   Christopher_Schultz  
Posted On:   Wednesday, December 26, 2001 07:29 AM

Boy, that's a ton of questions:




  1. Unicode is only mentioned when talking about the char datatype (instead of int, long, byte, float, etc.) because Unicode is a character set for text. Since we rarely talk about 'int' in a textual context, we don't need to consider those data types when talking about Unicode. Unicode assigns a number (a code point) to every character, and encodings such as UTF-16 specify how to express those numbers in terms of bytes. In the old days, we used EBCDIC, and then ASCII. ASCII defines 128 unique characters, each of which fits in one byte. A Java char is a 16-bit UTF-16 code unit, so it takes two bytes and can express 2^16 values (that's 65536 unique values). Unicode itself defines more characters than that; the ones beyond the first 65536 are stored as a pair of chars.



  2. Unicode and bytecode have nothing to do with one another.



  3. The format of bytecode, and of the class files it is stored in, is described in the Java Virtual Machine Specification, available at java.sun.com.



  4. Java is compiled to bytecode and interpreted so that you can compile it in one place and run it in another. This is in contrast to, say, a C program, which must be compiled each time it moves environments. Sometimes these programs can be shared across computers (like those running Win32), and sometimes they must be recompiled and linked against differing libraries (often the case in UNIX).



  5. Interpreted code runs slower than code compiled to native instructions. However, the designers of the language felt that the ability for bytecode to be executed on any machine with a VM outweighed the performance degradation from interpretation of bytecode. Don't forget that your microprocessor 'interprets' instructions in your normal EXE files, so at some point, everything is being interpreted.



  6. There's nothing stopping you from compiling your Java program to native code. There are plenty of products available for this purpose. Not only that, but most decent VMs include a JIT which does this compilation on-the-fly for you.



  7. JITs are so popular because they close the performance gap opened by the interpreted nature of the Java language. They work by analyzing the bytecode at runtime and compiling your application piecemeal into native code so that it runs faster. This is the actual smarts of the Java VM. Bytecode is actually extremely easy to interpret. A skilled programmer could probably write a VM in a weekend that would handle most of the Java spec, if you ignore threading issues and synchronization. The real work in the VM goes into the JIT and memory management (garbage collection).



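  8. ASCII is a subset of Unicode in that every character in ASCII is available in Unicode with the same numeric value: the 128 ASCII characters are the first 128 Unicode code points (0 through 127). It turns out that, in a 16-bit encoding, any ASCII character is a Unicode character whose first (high) byte is zero.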



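  9. The Java books that are available in the market are using Java code. Forget about Unicode versus ASCII. Inside a running Java program you can't even do ASCII if you tried: every char and String is Unicode. ASCII only comes up when you convert text to or from bytes, for example when reading or writing a file. A common choice there is UTF-8, which is a blend of ASCII and Unicode: it can encode every Unicode character, but ASCII characters take a single byte with the same value as in ASCII.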
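
To make point 1 a bit more concrete, here is a small sketch of what "char is 16-bit Unicode" looks like from inside a program. The class name CharBasics and its method names are made up for this example; Character.SIZE and Character.charCount are standard library members.

```java
// Illustrative sketch only; CharBasics and its methods are hypothetical names.
final class CharBasics {

    // Number of bits in a Java char (one UTF-16 code unit).
    static int bitsPerChar() {
        return Character.SIZE;
    }

    // A char is a number under the hood: widening it to int exposes that value.
    static int codeOf(char c) {
        return (int) c;
    }

    // How many chars Java needs to hold one Unicode code point.
    static int charsNeeded(int codePoint) {
        return Character.charCount(codePoint);
    }

    public static void main(String[] args) {
        System.out.println(bitsPerChar());        // prints 16
        System.out.println(codeOf('A'));          // prints 65
        System.out.println(charsNeeded('A'));     // prints 1
        System.out.println(charsNeeded(0x1D11E)); // prints 2 (a character beyond U+FFFF)
    }
}
```

An int or a long is just a number with no character meaning attached, which is why Unicode never comes up when those types are described.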

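On point 3, the start of a compiled class file is easy to inspect: a 4-byte magic number (0xCAFEBABE), then a 2-byte minor version and a 2-byte major version, all big-endian. The sketch below checks those fields in a byte array. ClassFileHeader and its methods are hypothetical names, and the sample bytes in main are hand-written rather than taken from a real file.

```java
import java.nio.ByteBuffer;

// Illustrative sketch only; ClassFileHeader and its methods are hypothetical names.
final class ClassFileHeader {

    // Every class file starts with these four bytes.
    static final int MAGIC = 0xCAFEBABE;

    // True if the array is long enough to hold the header and starts with the magic number.
    static boolean hasClassFileMagic(byte[] bytes) {
        return bytes.length >= 8 && ByteBuffer.wrap(bytes).getInt() == MAGIC;
    }

    // Reads the major version stored in bytes 6 and 7 (assumes at least 8 bytes).
    static int majorVersion(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes); // ByteBuffer is big-endian by default
        buf.getInt();                            // skip the magic number
        buf.getShort();                          // skip the minor version
        return buf.getShort() & 0xFFFF;          // treat the major version as unsigned
    }

    public static void main(String[] args) {
        // Hand-written header: magic, minor version 0, major version 52.
        byte[] sample = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52};
        System.out.println(hasClassFileMagic(sample)); // prints true
        System.out.println(majorVersion(sample));      // prints 52
    }
}
```

To try it on a real file, you could pass in the first eight bytes of any .class file produced by javac.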

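Point 8 can be checked directly. The sketch below encodes an ASCII letter as big-endian UTF-16 and shows that the high byte is zero and the low byte is the ASCII value. AsciiInUnicode and its method names are invented for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative sketch only; AsciiInUnicode and its methods are hypothetical names.
final class AsciiInUnicode {

    // ASCII covers code values 0 through 127.
    static boolean isAscii(char c) {
        return c < 128;
    }

    // Encodes the text as big-endian UTF-16, two bytes per char, with no byte-order mark.
    static byte[] utf16be(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        System.out.println(isAscii('A'));                   // prints true
        System.out.println(isAscii('\u00e9'));              // prints false (e with acute accent)
        System.out.println(Arrays.toString(utf16be("A")));  // prints [0, 65]
    }
}
```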

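For point 9, the place where ASCII and UTF-8 actually show up is the conversion between a String and bytes. Here is a minimal sketch, assuming Java 7 or later for StandardCharsets. EncodingDemo and its method names are made up. Non-ASCII characters are written as \u escapes so the example does not depend on the encoding of the source file.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; EncodingDemo and its methods are hypothetical names.
final class EncodingDemo {

    // Number of bytes the text occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Encodes to US-ASCII and decodes again; characters ASCII cannot represent come back as '?'.
    static String asciiRoundTrip(String s) {
        byte[] ascii = s.getBytes(StandardCharsets.US_ASCII);
        return new String(ascii, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("A"));              // prints 1
        System.out.println(utf8Length("\u00e9"));         // prints 2 (e with acute accent)
        System.out.println(utf8Length("\u20ac"));         // prints 3 (euro sign)
        System.out.println(asciiRoundTrip("caf\u00e9"));  // prints caf?
    }
}
```

Plain ASCII text comes out byte-for-byte identical in US-ASCII and UTF-8, which is why UTF-8 is usually the safer default when you are unsure.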

Hope that helps (a lot),

-chris