Question:
According to http://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html, the size of a char is 16 bits, i.e. 2 bytes. I somehow recalled it being 8 bits, i.e. 1 byte. To clear up my doubt, I created a text file containing the single character "a" and saved it. When I inspected the file, its size was 1 byte, i.e. 8 bits. I am confused: what is the size of a character? If it is 2 bytes, why is the file size 1 byte, and if it is 1 byte, why does the link say 2 bytes?
Answer 1:
A char in Java is a UTF-16 code unit. It is not necessarily a complete Unicode character, but it is effectively an unsigned 16-bit integer.
When you write text to a file (or convert it into a sequence of bytes in some other way), the resulting data depends on which encoding you use. For example, if you use ASCII or ISO-8859-1, you are very limited in which characters you can write, but each character takes only one byte. If you use UTF-16, each Java char is converted into exactly two bytes, but some Unicode characters may take four bytes (those represented by two Java char values).
If you use UTF-8, the length of even a single Java char in the encoded form depends on its value.
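To make the point above concrete, here is a small sketch that encodes the same text with several charsets and prints the byte counts. The class name EncodedLengths and the helper encodedLength are made up for this illustration. UTF_16BE is used rather than UTF_16 because, as far as I know, Java's UTF_16 encoder prepends a 2-byte byte-order mark, which would inflate the count.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodedLengths {
    // Hypothetical helper: number of bytes the text occupies in the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String a = "a";
        String euro = "\u20AC"; // the euro sign, a single Java char
        // U+1F600 is outside the Basic Multilingual Plane, so it needs two char values.
        String emoji = new String(Character.toChars(0x1F600));

        System.out.println(encodedLength(a, StandardCharsets.US_ASCII));     // 1
        System.out.println(encodedLength(a, StandardCharsets.ISO_8859_1));   // 1
        System.out.println(encodedLength(a, StandardCharsets.UTF_8));        // 1
        System.out.println(encodedLength(a, StandardCharsets.UTF_16BE));     // 2

        System.out.println(encodedLength(euro, StandardCharsets.UTF_8));     // 3
        System.out.println(encodedLength(euro, StandardCharsets.UTF_16BE));  // 2

        System.out.println(emoji.length());                                  // 2 (char values)
        System.out.println(encodedLength(emoji, StandardCharsets.UTF_16BE)); // 4
        System.out.println(encodedLength(emoji, StandardCharsets.UTF_8));    // 4
    }
}
```

So the 16-bit char is the in-memory unit, while the number of bytes written out is a property of the chosen encoding.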
Answer 2:
There is a contemporary way to learn its size: just print Character.BYTES.
System.out.println(Character.BYTES);
This prints 2.
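For completeness, a slightly fuller sketch along the same lines. Character.BYTES exists from Java 8 onward, and Character.SIZE gives the same information in bits. The class name CharSize and the helper distinctValues are invented for this example.

```java
public class CharSize {
    // Hypothetical helper: how many distinct values a char can hold.
    static int distinctValues() {
        return Character.MAX_VALUE - Character.MIN_VALUE + 1;
    }

    public static void main(String[] args) {
        System.out.println(Character.BYTES);           // 2
        System.out.println(Character.SIZE);            // 16
        System.out.println((int) Character.MIN_VALUE); // 0
        System.out.println((int) Character.MAX_VALUE); // 65535
        System.out.println(distinctValues());          // 65536, i.e. 2^16
    }
}
```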
Answer 3:
Note that a text file is always written in some character encoding. Text files are commonly saved as UTF-8, which uses 8 bits (1 byte) for each ASCII character such as "a" and 2 to 4 bytes for every other character.
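The experiment from the question can be reproduced in code. The sketch below writes a string to a temporary file in a given encoding and reports the file size; FileSizeDemo and sizeOnDisk are names made up for this illustration, and the temporary file prefix is arbitrary.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class FileSizeDemo {
    // Hypothetical helper: write the text to a temp file and return the file size in bytes.
    static long sizeOnDisk(String text, Charset charset) {
        try {
            Path tmp = Files.createTempFile("char-size", ".txt");
            try {
                Files.write(tmp, text.getBytes(charset));
                return Files.size(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sizeOnDisk("a", StandardCharsets.UTF_8));      // 1
        System.out.println(sizeOnDisk("a", StandardCharsets.UTF_16BE));   // 2
        System.out.println(sizeOnDisk("\u00E9", StandardCharsets.UTF_8)); // 2 (e with acute accent)
    }
}
```

A 1-byte file containing "a" is therefore exactly what you would expect from an editor that saves in ASCII, ISO-8859-1, or UTF-8.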
Answer 4:
A char in Java is 2 bytes large (as its valid value range suggests). But that does not mean every representation of a character is 2 bytes long. For instance, many encodings reserve only 1 byte for every character (or use 1 byte for the most frequent characters). If the platform default encoding is a 1-byte encoding such as ISO-8859-1 or a variable-length encoding such as UTF-8, Java can easily convert that 1 byte into a single char.
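The reverse direction can be sketched as well: decoding bytes back into char values. The class name DecodeBytes and the helper decode are assumptions for this example. The charset is passed explicitly here instead of relying on the platform default, so the result does not vary between machines.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DecodeBytes {
    // Hypothetical helper: decode raw bytes into a String using the given charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        byte[] oneByte = {0x61}; // the single byte stored in the "a" file

        // One byte on disk becomes one 2-byte char in memory under either encoding.
        System.out.println(decode(oneByte, StandardCharsets.ISO_8859_1));          // a
        System.out.println(decode(oneByte, StandardCharsets.UTF_8).length());      // 1

        // In UTF-8, the two bytes C3 A9 also decode to a single char (U+00E9).
        byte[] twoBytes = {(byte) 0xC3, (byte) 0xA9};
        System.out.println(decode(twoBytes, StandardCharsets.UTF_8).length());     // 1
        // The same two bytes read as ISO-8859-1 yield two chars instead.
        System.out.println(decode(twoBytes, StandardCharsets.ISO_8859_1).length()); // 2
    }
}
```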