i don't understand the lines in the loop : we take the character and subtract a
, so "10" ? (why?)
then 1 << val
: we shift 1 by val ? (why?)
and checker is 0, so how do we reach > 0
in the condition?
public static boolean isUniqueChars(String str) {
int checker = 0;
for (int i = 0; i < str.length(); i++) {
int val = str.charAt(i) - 'a';
if ((checker & (1 << val)) > 0) return false;
checker |= (1 << val);
}
return true;
}
Thanks
The code assumes that str
is made of lower case letters and returns true if there are no repeating letters. It works like this:
checker
is used as a bitmap, that is, each bit in this variable is used to keep track of one letter. Subtracting 'a' from the character gives 0 for 'a', 1 for 'b', 2 for 'c' etc. Shifting 1 left by this number gives 1 for 'a', 2 for 'b', 4 for 'c' etc.
By oring (|
) this value with checker
the code keeps track of previously encountered characters. So, if we encounter a second 'a' for example, checker
has it's first bit set (which is tested by &
in the if statement), so we know that str
has duplicates.
In short, checker
is used as a faster and more compact array of bool. This and similar techniques are called bit manipulation.
str.charAt(i) - 'a'
returns, more or less, the "letter in the alphabet": if str.charAt(i)
is 'a'
, this is 0
, if it's 'b'
, val
is 1
, if it's z
, val
is 25
, and so on.
The rest of this is using a bit-twiddling technique: if the val
th bit of checker
is 1
, then we've seen the val
th character already. So checker & (1 << val)
is nonzero if and only if the val
th bit of checker
is set, meaning we've already seen that character. Otherwise, we set the val
th bit of checker
and continue the loop.
The question is quite old but I think my answer is correct and clearer.
We are dealing with lower case letter only in this case.
Like Paul asked, (1 << val)
means Shift 1 by 'val' not shift 'val' by one. So for example, 'hello' gives:
1st loop: val = 111 in binary, 7 in decimal
1 << val = 10000000 <-- here one is shifted by 7
checker = 10000000
2nd loop: val = 100 in binary, 4 in decimal
1 << val = 10000
checker = 10010000
3rd loop: val = 1011 in binary, 11 in decimal
1 << val = 100000000000
checker = 100010010000
4th loop: val = 1011 in binary, 11 in decimal
1 << val = 100000000000
(checker & (1 << val)) gives 100000000000 which is greater than 0
The loop is over because 'l' is not unique after 'hel' in 'hello'.
I hope this clears things.
I have the source code if you'd like to see exactly what's happening.
public class one_one_one {
public static void main(String[] args) {
String chars = "hello";
if (isUniqueChars(chars)) {
System.out.println("Unique");
} else {
System.out.println("Not Unique");
}
}
public static boolean isUniqueChars(String str) {
int checker = 0;
for (int i = 0; i < str.length(); i++) {
int val = str.charAt(i) - 'a';
System.out.println(Integer.toBinaryString(val)); //debug printout
int temp = 1 << val;
System.out.println(Integer.toBinaryString(temp)); //debug printout
System.out.println(checker & (1 << val)); //debug printout
if ((checker & (1 << val)) > 0) {
return false;
}
checker |= (1 << val);
System.out.println(Integer.toBinaryString(checker)); //debug printout
System.out.println(' '); //debug printout
}
return true;
}
}