Java Character literals value with getNumericValue

2019-02-17 18:47发布

问题:

Why do I get the same results for both upper- and lowercase literals? For instance:

char ch1 = 'A';
char ch2 = 'a';
char ch3 = 'Z';
char ch4 = 'z';

print("ch1 -- > " + Integer.toBinaryString(Character.getNumericValue(ch1)));
print("ch2 -- > " + Integer.toBinaryString(Character.getNumericValue(ch2)));
print("ch3 -- > " + Integer.toBinaryString(Character.getNumericValue(ch3)));
print("ch4 -- > " + Integer.toBinaryString(Character.getNumericValue(ch4)));

As results I get:

ch1 -- > 1010
ch2 -- > 1010
ch3 -- > 100011
ch4 -- > 100011

And don't really see the difference between 'A' and 'a'. Even if I use character literals in UTF form (\u0041 for 'A' and \u0061 for 'a') I do get the same results.

回答1:

It's behaving exactly as documented:

The letters A-Z in their uppercase ('\u0041' through '\u005A'), lowercase ('\u0061' through '\u007A'), and full width variant ('\uFF21' through '\uFF3A' and '\uFF41' through '\uFF5A') forms have numeric values from 10 through 35.

Basically this means that when parsing hex (say), 0xfa == 0xFA, as you'd expect.

I'd only expect case to matter when using something like base64.



回答2:

Judging from the commentary, you're actually looking for the codepoints of the characters, rather than their numeric value, so I'll just isolate that into an answer. The getNumericValue() function returns what the character means as a number when interpreting its glyph, it does not return the codepoint of a character. For instance, getNumericValue('5') returns 5 as an int, not the codepoint of 5.

To use the codepoints, just use your variables or the char literals as they are. char is a numeric datatype. For instance, System.out.println((int)'a'); will print 65, quite simply.