EDIT
Thanks for the prompt responses. Please see what the real question is. I have made it bold this time.
I do understand the difference between == and .equals. So, that's not my question (I actually added some context for that)
I'm performing the validation below for empty strings:
if( "" == value ) {
// is empty string
}
In the past when fetching values from the db or deserializing objects from another node, this test failed, because the two string instances were indeed different object references, albeit they contained the same data.
So the fix for those situations was
if( "".equals( value ) ) {
// which returns true for all the empty strings
}
I'm fine with that. That's clearly understood.
Today this happened once again, but it puzzled me because this time the application is a very small standalone application that doesn't use the network at all, so no new string is fetched from the database or deserialized from another node.
So the question is:
Under which OTHER circumstances:
"" == value // yields false
and
"".equals( value ) // yields true
For a local standalone application?
I'm pretty sure new String() is not being used in the code.
And the only way a string reference could be "" is because it is being assigned "" directly in the code (or that's what I thought) like in:
String a = "";
String b = a;
assert "" == b ; // this is true
Somehow (after reading the code more I have a clue) two different empty string object references were created. I would like to know how.
More along the lines of jjnguy's answer:
Byte!
EDIT: Conclusion
I've found the reason.
After jjnguy's suggestion I was able to look at the code with different eyes.
The guilty method: StringBuilder.toString()
A new String object is allocated and initialized to contain the character sequence currently represented by this object.
Doh!...
StringBuilder b = new StringBuilder("h");
b.deleteCharAt( 0 );
System.out.println( "" == b.toString() ); // prints false
Mystery solved.
The code uses StringBuilder to deal with an ever growing string. It turns out that at some point somebody did:
public void someAction( String string ) {
if( "" == string ) {
return;
}
deleteBankAccount( string );
}
and then called it with:
someAction( myBuilder.toString() ); // bug introduced.
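A minimal sketch of the repaired guard, using the "".equals( value ) form from earlier in the question. The class name and the String return value are hypothetical, added only so the outcome can be observed; the real method called deleteBankAccount instead. Whether == happens to succeed for a particular builder result is an implementation detail, so the sketch does not rely on it.

```java
public class GuardSketch {

    // Hypothetical stand-in for someAction: reports what it would do
    // instead of calling deleteBankAccount.
    static String someAction(String string) {
        // Compare by value, not by reference, so any empty string is caught.
        if ("".equals(string)) {
            return "skipped";
        }
        return "deleted:" + string;
    }

    public static void main(String[] args) {
        StringBuilder myBuilder = new StringBuilder("h");
        myBuilder.deleteCharAt(0);          // builder now holds zero characters

        System.out.println(someAction(myBuilder.toString())); // skipped
        System.out.println(someAction("12345"));              // deleted:12345
    }
}
```

Because "".equals( null ) is false, a null argument is not treated as empty here; it falls through to the second branch.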
p.s. Have I read too much CodingHorror lately? Or why do I feel the need to add some funny animal pictures here?
String s = "";
String s2 = someUserInputVariable.toLowerCase(); // where the user entered ""
Something like that would cause s == s2
to evaluate to false.
Lots of code will create new Strings
without exposing the call to new String()
.
"" == value // yields false
and
"".equals( value ) // yields true
any time the value of the variable value
has not been interned. This will be the case if the value is computed at run time. See the JLS section 3.10.5 String Literals for example code illustrating this:
Thus, the test program consisting of the compilation unit (§7.3):
package testPackage;
class Test {
public static void main(String[] args) {
String hello = "Hello", lo = "lo";
System.out.print((hello == "Hello") + " ");
System.out.print((Other.hello == hello) + " ");
System.out.print((other.Other.hello == hello) + " ");
System.out.print((hello == ("Hel"+"lo")) + " ");
System.out.print((hello == ("Hel"+lo)) + " ");
System.out.println(hello == ("Hel"+lo).intern());
}
}
class Other { static String hello = "Hello"; }
and the compilation unit:
package other;
public class Other { static String hello = "Hello"; }
produces the output:
true true true true false true
This example illustrates six points:
- Literal strings within the same class (§8) in the same package (§7) represent references to the same String object (§4.3.1).
- Literal strings within different classes in the same package represent references to the same String object.
- Literal strings within different classes in different packages likewise represent references to the same String object.
- Strings computed by constant expressions (§15.28) are computed at compile time and then treated as if they were literals.
- Strings computed at run time are newly created and therefore distinct.
- The result of explicitly interning a computed string is the same string as any pre-existing literal string with the same contents.
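The fourth and fifth points hinge on whether the concatenation is a constant expression. A small sketch of that difference, with a hypothetical class and method names: declaring the local variable final makes it a constant variable, so the concatenation is folded at compile time.

```java
public class ConstantConcatSketch {

    // "Hel" + lo is computed at run time because lo is not a constant variable.
    static boolean runtimeConcatIsSameObject() {
        String hello = "Hello";
        String lo = "lo";
        return hello == ("Hel" + lo);
    }

    // With a final local initialized from a literal, "Hel" + lo is a
    // constant expression and is treated like the literal "Hello".
    static boolean constantConcatIsSameObject() {
        String hello = "Hello";
        final String lo = "lo";
        return hello == ("Hel" + lo);
    }

    public static void main(String[] args) {
        System.out.println(runtimeConcatIsSameObject());   // false
        System.out.println(constantConcatIsSameObject());  // true
    }
}
```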
If you can get hold of the book Java Puzzlers by Joshua Bloch and Neal Gafter, look at puzzle 13, "Animal Farm"; the authors have great advice on this issue. I am going to copy some relevant text:
"You may be aware that compile-time constants of type String
are interned [JLS 15.28]. In other words any two constant expressions of type String
that designate the same character sequence are represented by identical object references... Your code should rarely, if ever, depend on the interning of string constants. Interning was designed solely to reduce the memory footprint of the virtual machine, not as a tool for programmers... When comparing object references, you should use the equals
method in preference to the ==
operator unless you need to compare object identity rather than value."
That's from the above reference I mentioned... pages 30 - 31 in my book.
Would you expect "abcde".substring(1,2)
and "zbcdefgh".substring(1,2)
to yield the same String object?
They both yield "equal" sub-strings extracted from two different Strings, but it seems quite reasonable that they are different objects, so == sees them as different.
Now consider when the substring has length 0, substring(1, 1)
. It yields a zero length String, but it's not surprising that the "abcde".substring(1,1)
is a different object from "zbcdefgh".substring(1,1)
and hence at least one of them is a different object from "".
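A short sketch of the zero-length case, with hypothetical class and helper names. Whether the two results are distinct objects is left to the library implementation (some JDK versions may hand back a shared empty string), so the identity line carries no expected output; the value-based checks hold either way.

```java
public class SubstringSketch {

    // Returns the zero-length slice of s starting and ending at index i.
    static String emptySlice(String s, int i) {
        return s.substring(i, i);
    }

    public static void main(String[] args) {
        String a = emptySlice("abcde", 1);
        String b = emptySlice("zbcdefgh", 1);

        System.out.println(a.equals(b));       // true
        System.out.println(a.isEmpty());       // true
        System.out.println(a.intern() == "");  // true

        // Identity of the raw results is implementation-dependent.
        System.out.println(a == b);
        System.out.println(a == "");
    }
}
```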
As I understand it, identical string literals are made to refer to the same object, either when the code is compiled to bytecode or when the classes are loaded, in order to save memory. So sometimes you get away with == comparisons of strings, but this is not something you can rely on for strings in general.
When a string is built at run time, there is often no way for the program to know that it matches an existing one, and all of a sudden the check fails, because it relied on underlying behaviour that depends on the implementation of the JVM you are using.
So using equals is always the right thing to do. For empty strings there are other possibilities, like checking length() == 0 or, if you don't need to support versions older than Java 6, String.isEmpty().
You should try considering String.length() == 0
.
Check this reference: http://mindprod.com/jgloss/string.html#COMPARISON at the excellent Canadian Mind Products Java & Internet Glossary. Worth a bookmark.
The javadoc for String.intern()
has some good commentary on ==
vs. .equals()
.
The documentation also clarifies that every string literal is interned.
public String intern()
Returns a canonical representation for the string object.
A pool of strings, initially empty, is maintained privately by the class String.
When the intern method is invoked, if
the pool already contains a string
equal to this String object as
determined by the equals(Object)
method, then the string from the pool
is returned. Otherwise, this String
object is added to the pool and a
reference to this String object is
returned.
It follows that for any two strings s
and t, s.intern() == t.intern() is
true if and only if s.equals(t) is
true.
All literal strings and string-valued
constant expressions are interned.
String literals are defined in §3.10.5
of the Java Language Specification
Returns: a string that has the same
contents as this string, but is
guaranteed to be from a pool of unique
strings.
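The property quoted above can be checked directly. This is only an illustrative sketch with a hypothetical class name; it uses new String("") because the new operator always produces a distinct object, which makes the reference comparison predictable.

```java
public class InternSketch {

    // True exactly when s and t have equal contents, per the intern() contract.
    static boolean sameAfterIntern(String s, String t) {
        return s.intern() == t.intern();
    }

    public static void main(String[] args) {
        String literal = "";
        String fresh = new String("");   // distinct object with the same contents

        System.out.println(literal == fresh);                 // false
        System.out.println(literal.equals(fresh));            // true
        System.out.println(sameAfterIntern(literal, fresh));  // true
        System.out.println(sameAfterIntern("a", "b"));        // false
    }
}
```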
If you use Google Code Search, you can find lots of places where people make this same error: search for file:.java \=\=\ \"\" Of course, this can be a correct idiom in carefully controlled circumstances, but usually it's just a bug.
Why not use:
if (value != null && value.length() == 0) {
// do stuff (the condition above could instead be "value == null ||")
}
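The two variants of that condition answer different questions, which matters when value can be null. A small sketch with hypothetical helper names:

```java
public class NullSafeEmptySketch {

    // True only for a non-null string with zero characters.
    static boolean isEmpty(String value) {
        return value != null && value.length() == 0;
    }

    // True for null as well as for a string with zero characters.
    static boolean isNullOrEmpty(String value) {
        return value == null || value.length() == 0;
    }

    public static void main(String[] args) {
        System.out.println(isEmpty(""));          // true
        System.out.println(isEmpty(null));        // false
        System.out.println(isNullOrEmpty(null));  // true
        System.out.println(isNullOrEmpty("x"));   // false
    }
}
```

Neither helper depends on object identity, so both behave the same for literals and for strings built at run time.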
You should use equals()
because ==
for objects compares references, i.e., are they the same object. Whilst at compile time Java finds identical strings and makes them share the same reference (Strings are immutable), at runtime it is easy to create empty strings that have different references, where == fails for your typical intention of equals()
.