When using the code below
int count = 0;
while(count != '♂'){
count++;
}
System.out.print(count);
in the Interactions pane in DrJava, it prints the correct Unicode number (9794), but when done in the programming pane, it returns 63, using the exact same code.
Why is that?
This is getting complicated. When using this code in the Interactions pane:
import java.util.Scanner;
Scanner scan = new Scanner(System.in);
int count = 0;
String letter = scan.next();
// input is ♂
while(count != letter.charAt(0)){
count++;
}
it returns 66, and the same in the programming pane.
Character encoding depends on the system. Depending on your machine, it may print all 16 bits of the UTF-16 code unit (which is 9794), only the high 8 bits (which would be 38) or, as in your case, only the low 8 bits (which is 66). Using integer division:
- 9794 / 256 = 38
- 9794 % 256 = 66
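
As a rough check of that arithmetic, here is a minimal sketch that splits the character into its two bytes. The class and method names (`ByteSplit`, `highByte`, `lowByte`) are made up for illustration, and the source uses the escape for U+2642 so it survives any file encoding.

```java
class ByteSplit {
    // High 8 bits of a 16-bit UTF-16 code unit (same as dividing by 256).
    static int highByte(char c) {
        return c >> 8;
    }

    // Low 8 bits of a 16-bit UTF-16 code unit (same as the remainder mod 256).
    static int lowByte(char c) {
        return c & 0xFF;
    }

    public static void main(String[] args) {
        char male = '\u2642';                 // the male symbol, written as an escape
        System.out.println((int) male);       // 9794
        System.out.println(highByte(male));   // 38
        System.out.println(lowByte(male));    // 66
    }
}
```

If the tool hands your program only the low byte, 66 is exactly what the loop will count up to.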
It appears to be a problem with the tools you're using, which is probably a good reason to stop using them :-)
The male symbol, U+2642, has 0x42 as the final octet (from 0x26 0x42) which, in decimal, is 66 (4 x 16 + 2).
> when done in the programming pane, it returns 63, using the exact same code.
Character 63 is the question mark, ?. This is used as a placeholder when a character you have used cannot be written in the encoding you are writing to.
I'm assuming by “programming pane” you mean the main source code editor. This content is saved to a file before being compiled, so it has to be saved using a particular encoding. Unfortunately, it looks like DrJava doesn't give you the option to save files as anything other than the system default code page, probably in your case Windows code page 1252. This encoding does not include the character ♂, so it gets replaced by ?.
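
You can see the same substitution from inside Java. This is only a sketch: it uses ISO-8859-1 as a stand-in, because that charset is guaranteed to exist on every JVM and, like code page 1252, has no slot for U+2642. The names `ReplacementDemo` and `encodeLatin1` are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class ReplacementDemo {
    // Encode text into a single-byte charset that cannot represent U+2642.
    // String.getBytes(Charset) substitutes the charset's replacement byte
    // for unmappable characters instead of throwing.
    static byte[] encodeLatin1(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] bytes = encodeLatin1("\u2642");
        System.out.println(bytes.length);      // 1
        System.out.println(bytes[0]);          // 63
        System.out.println((char) bytes[0]);   // ?
    }
}
```

Once the file on disk contains that single `?` byte, the compiler never sees the original character, which is why the loop stops at 63.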
So it looks like with your current workflow you will sadly have to stick to ASCII-only source, and consequently say:
while (count != '\u2642') { ...
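
Put together, an ASCII-only version of the original loop might look like the sketch below. The class and method names (`MaleSymbolCount`, `countUpTo`) are invented for the example. Because the source contains only ASCII, it should compile to the same thing whatever encoding the editor saves with.

```java
class MaleSymbolCount {
    // Count upward until the counter equals the character's numeric value.
    static int countUpTo(char target) {
        int count = 0;
        while (count != target) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // The escape for U+2642 stands in for the literal male symbol.
        System.out.print(countUpTo('\u2642'));   // 9794
    }
}
```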