Using DrJava, why does ♂ equal something different?

Posted 2019-08-08 17:43

Question:

When using the code below

int count = 0;
while(count != '♂'){
    count++;
}
System.out.print(count);

in the Interactions pane in DrJava, it prints the correct Unicode code point (9794), but when run from the programming pane it prints 63, using exactly the same code.

Why is that?

This is getting complicated. When I use this code in the Interactions pane

import java.util.Scanner;
Scanner scan = new Scanner(System.in);
int count = 0;
String letter = scan.next();

input is ♂

while(count != letter.charAt(0)){
    count++;
}

it returns 66, and the same happens in the programming pane.

Answer 1:

Character encoding depends on the system. Depending on your setup, you may get all 16 bits of the UTF-16 code unit (which is 9794), only the high 8 bits (which would be 38) or, as in your case, only the low 8 bits (which is 66).

  • 9794 / 256 = 38
  • 9794 % 256 = 66
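
The arithmetic above can be checked with a short Java sketch. The class and method names here are made up for illustration, and the character is written as a Unicode escape so the example does not depend on the encoding of the source file:

```java
class MaleSignBytes {
    // U+2642 written as a Unicode escape so the source stays ASCII-only.
    static final char MALE_SIGN = '\u2642';

    // High 8 bits of the 16-bit UTF-16 code unit (same as dividing by 256).
    static int highByte(char c) {
        return (c >> 8) & 0xFF;
    }

    // Low 8 bits of the 16-bit UTF-16 code unit (same as the remainder mod 256).
    static int lowByte(char c) {
        return c & 0xFF;
    }

    public static void main(String[] args) {
        System.out.println((int) MALE_SIGN);      // 9794
        System.out.println(highByte(MALE_SIGN));  // 38
        System.out.println(lowByte(MALE_SIGN));   // 66
    }
}
```

Getting 66 back is consistent with something along the input path keeping only the low byte of the character.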


Answer 2:

It appears to be a problem with the tools you're using, which is probably a good reason to stop using them :-)

The male symbol, U+2642, has 0x42 as its final octet (from 0x26 0x42), which in decimal is 66 (4 x 16 + 2).



Answer 3:

"when done in the programming pane, it returns 63, using the exact same code."

Character 63 is the question mark, ?. It is used as a placeholder when a character you have used cannot be represented in the encoding you are writing to.

I'm assuming by “programming pane” you mean the main source code editor. This content is saved to a file before being compiled, so it has to be saved using a particular encoding. Unfortunately it looks like DrJava doesn't give you the option to save files as anything other than the system default code page, probably in your case Windows code page 1252. This encoding does not include the character ♂, so it gets replaced by ?.
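
The replacement can be reproduced in plain Java. This is only a sketch of the effect, not of what DrJava does internally. It uses ISO-8859-1 as a stand-in for code page 1252, because ISO-8859-1 is guaranteed to be available on every JVM and also has no ♂. The class and method names are made up for illustration:

```java
import java.nio.charset.StandardCharsets;

class UnmappableDemo {
    // Encode to bytes and decode again through a charset without the male sign,
    // roughly what happens when a source file is saved and then read back.
    // ISO-8859-1 is an assumption standing in for the Windows default code page.
    static String roundTrip(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The character literal as it would appear in the source file.
        String saved = roundTrip("'\u2642'");
        System.out.println(saved);                  // '?'
        System.out.println((int) saved.charAt(1));  // 63
    }
}
```

After the round trip the compiler would effectively see '?' in place of the original literal, which is why the loop stops at 63.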

So it looks like with your current workflow you will sadly have to stick to ASCII-only source, and consequently say:

while (count != '\u2642') { ...
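
Written out in full, the ASCII-only version of the original loop might look like the sketch below. The class and method names are made up for illustration. Because the escape is resolved by the compiler, the result should not depend on the encoding the file is saved in:

```java
class CountToMaleSign {
    // Counts up from zero until the counter equals the character's numeric value.
    static int countUpTo(char target) {
        int count = 0;
        while (count != target) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // The Unicode escape keeps the source file ASCII-only.
        System.out.print(countUpTo('\u2642'));  // 9794
    }
}
```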