When "" == s is false but "".equals( s ) is true

Posted 2020-01-27 10:01

Question:

EDIT Thanks for the prompt responses. Please see what the real question is. I have made it bold this time.

I do understand the difference between == and .equals. So, that's not my question (I actually added some context for that)


I'm performing the validation below for empty strings:

if( "" == value ) { 
    // is empty string 
} 

In the past, when fetching values from the db or deserializing objects from another node, this test failed because the two string instances were indeed different object references, although they contained the same data.

So the fix for those situations was

if( "".equals( value ) ) {
   // which returns true for all the empty strings
}

I'm fine with that. That's clearly understood.

Today this happened once again, but it puzzled me because this time the application is a very small standalone application that doesn't use the network at all, so no new string is fetched from the database or deserialized from another node.

So the question is:


Under which OTHER circumstances:

"" == value // yields false 

and

"".equals( value ) // yields true

For a local standalone application?

I'm pretty sure new String() is not being used in the code.

And the only way a string reference could be "" is because it is being assigned "" directly in the code (or that's what I thought) like in:

String a = "";
String b = a;

assert "" == b ; // this is true 

Somehow (after reading the code more I have a clue) two different empty string object references were created. I would like to know how.

More along the lines of jjnguy's answer:

Byte!

EDIT: Conclusion

I've found the reason.

After jjnguy's suggestion I was able to look at the code with different eyes.

The guilty method: StringBuilder.toString()

A new String object is allocated and initialized to contain the character sequence currently represented by this object.

Doh!...

    StringBuilder b = new StringBuilder("h");
    b.deleteCharAt( 0 );
    System.out.println( "" == b.toString() ); // prints false

Mystery solved.

The code uses StringBuilder to deal with an ever growing string. It turns out that at some point somebody did:

 public void someAction( String string ) { 
      if( "" == string ) {
           return;
       }

       deleteBankAccount( string );
 }

and then called it like this:

 someAction( myBuilder.toString() ); // bug introduced. 
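
A minimal sketch of how the guard could be written so that it no longer depends on object identity. The class name `GuardSketch` and the `deleted` list (a stand-in for `deleteBankAccount`) are hypothetical and exist only for illustration. Whether `"" == myBuilder.toString()` is true or false may differ between JDK versions, so the sketch compares by value only.

```java
import java.util.ArrayList;
import java.util.List;

public class GuardSketch {
    // Hypothetical stand-in for deleteBankAccount: records what would be deleted.
    static final List<String> deleted = new ArrayList<>();

    static void someAction(String string) {
        // Compare by value, so every empty string is caught,
        // whether it came from a literal or was built at run time.
        if ("".equals(string)) {
            return;
        }
        deleted.add(string);
    }

    public static void main(String[] args) {
        StringBuilder myBuilder = new StringBuilder("h");
        myBuilder.deleteCharAt(0);

        someAction(myBuilder.toString()); // empty: the guard returns early
        someAction("acct-42");            // non-empty: recorded

        System.out.println(deleted);      // prints [acct-42]
    }
}
```

With the value comparison in place, the empty result of `myBuilder.toString()` never reaches the destructive call.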

p.s. Have I read too much CodingHorror lately? Or why do I feel the need to add some funny animal pictures here?

Answer 1:

String s = "";
String s2 = someUserInputVariable.toLowerCase(); // where the user entered ""

Something like that would cause s == s2 to evaluate to false.

Lots of code will create new Strings without exposing the call to new String().



Answer 2:

"" == value // yields false

and

"".equals( value ) // yields true

any time the value of the variable value has not been interned. This will be the case if the value is computed at run time. See the JLS section 3.10.5 String Literals for example code illustrating this:

Thus, the test program consisting of the compilation unit (§7.3):

package testPackage;
class Test {
    public static void main(String[] args) {
        String hello = "Hello", lo = "lo";
        System.out.print((hello == "Hello") + " ");
        System.out.print((Other.hello == hello) + " ");
        System.out.print((other.Other.hello == hello) + " ");
        System.out.print((hello == ("Hel"+"lo")) + " ");
        System.out.print((hello == ("Hel"+lo)) + " ");
        System.out.println(hello == ("Hel"+lo).intern());
    }
}
class Other { static String hello = "Hello"; }

and the compilation unit:

package other;
public class Other { public static String hello = "Hello"; }

produces the output:

true true true true false true

This example illustrates six points:

  • Literal strings within the same class (§8) in the same package (§7) represent references to the same String object (§4.3.1).
  • Literal strings within different classes in the same package represent references to the same String object.
  • Literal strings within different classes in different packages likewise represent references to the same String object.
  • Strings computed by constant expressions (§15.28) are computed at compile time and then treated as if they were literals.
  • Strings computed at run time are newly created and therefore distinct.
  • The result of explicitly interning a computed string is the same string as any pre-existing literal string with the same contents.
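
The points about constant expressions, run-time computation and interning can be tried in a single file. This is only a sketch: the class name `InternSketch` and its method names are made up for illustration, and the `lo` value is passed as a parameter so that the compiler cannot treat the concatenation as a constant expression.

```java
public class InternSketch {
    static final String HELLO = "Hello";

    // "Hel" + lo is computed at run time because lo is a parameter,
    // so the result is a newly created String.
    static boolean sameAsRuntimeConcat(String lo) {
        return HELLO == ("Hel" + lo);
    }

    // Interning the computed string yields the pre-existing literal.
    static boolean sameAfterIntern(String lo) {
        return HELLO == ("Hel" + lo).intern();
    }

    // A final local initialized with a literal is a constant variable,
    // so "Hel" + lo is a constant expression evaluated at compile time.
    static boolean sameAsConstantExpression() {
        final String lo = "lo";
        return HELLO == ("Hel" + lo);
    }

    public static void main(String[] args) {
        System.out.println(sameAsConstantExpression()); // prints true
        System.out.println(sameAsRuntimeConcat("lo"));  // prints false
        System.out.println(sameAfterIntern("lo"));      // prints true
    }
}
```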


Answer 3:

If you can get hold of the book Java Puzzlers by Joshua Bloch and Neal Gafter, look at puzzle 13, "Animal Farm". It has great advice on this issue. I am going to copy some relevant text:

"You may be aware that compile-time constants of type String are interned [JLS 15.28]. In other words any two constant expressions of type String that designate the same character sequence are represented by identical object references... Your code should rarely, if ever, depend on the interning of string constants. Interning was designed solely to reduce the memory footprint of the virtual machine, not as a tool for programmers... When comparing object references, you should use the equals method in preference to the == operator unless you need to compare object identity rather than value."

That's from the above reference I mentioned... pages 30 - 31 in my book.



Answer 4:

Would you expect "abcde".substring(1,2) and "zbcdefgh".substring(1,2) to yield the same String object?

They both yield "equal" sub-strings extracted from two different Strings, but it seems quite reasonable that they are different objects, so == sees them as different.

Now consider when the substring has length 0, substring(1, 1). It yields a zero-length String, but it's not surprising that "abcde".substring(1,1) is a different object from "zbcdefgh".substring(1,1), and hence at least one of them is a different object from "".



Answer 5:

As I understand it, when Java code is compiled to bytecode or while the program runs, equal strings are in most cases made to reference the same object in order to save memory. So sometimes you get away with == comparisons of strings. But this is an optimization you cannot rely on.

Sometimes the compiler does not perform this optimization, or there is no way for the program to see that the strings are the same, and all of a sudden the check fails, because you were relying on underlying optimization voodoo that depends on the implementation of the JVM you are using, and so on.

So using equals is always the right thing to do. For empty strings there are other possibilities, such as comparing with length() == 0 or, if you don't care about backwards compatibility, string.isEmpty() (available since Java 6).
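
The alternatives mentioned here can be put side by side. This is an illustrative sketch, not a recommendation from the thread: the class `EmptyCheckSketch` and its method names are invented, and the explicit null checks are an assumption about how a caller would want null handled.

```java
public class EmptyCheckSketch {
    // Null-safe by construction: "".equals(null) is simply false.
    static boolean emptyByEquals(String value) {
        return "".equals(value);
    }

    // length() needs an explicit null check to avoid a NullPointerException.
    static boolean emptyByLength(String value) {
        return value != null && value.length() == 0;
    }

    // isEmpty() exists since Java 6 and also needs the null check.
    static boolean emptyByIsEmpty(String value) {
        return value != null && value.isEmpty();
    }

    public static void main(String[] args) {
        // new String("") is a distinct object that still has empty content.
        String runtimeEmpty = new String("");

        System.out.println(emptyByEquals(runtimeEmpty));  // prints true
        System.out.println(emptyByLength(runtimeEmpty));  // prints true
        System.out.println(emptyByIsEmpty(runtimeEmpty)); // prints true
        System.out.println(emptyByEquals(null));          // prints false
    }
}
```

All three agree on content and none of them depends on whether the string happens to be interned.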



Answer 6:

You should consider checking value.length() == 0 instead.



Answer 7:

Check this reference: http://mindprod.com/jgloss/string.html#COMPARISON at the excellent Canadian Mind Products Java & Internet Glossary. Worth a bookmark.



Answer 8:

The javadoc for String.intern() has some good commentary on == vs. .equals().

The documentation also clarifies that every string literal is interned.

public String intern()

Returns a canonical representation for the string object.

A pool of strings, initially empty, is maintained privately by the class String.

When the intern method is invoked, if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned.

It follows that for any two strings s and t, s.intern() == t.intern() is true if and only if s.equals(t) is true.

All literal strings and string-valued constant expressions are interned. String literals are defined in §3.10.5 of the Java Language Specification

Returns: a string that has the same contents as this string, but is guaranteed to be from a pool of unique strings.
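
The property quoted above, that s.intern() == t.intern() holds exactly when s.equals(t), can be checked with an empty string that is not the literal. A small sketch, assuming `new String("")` as a stand-in for any empty string produced at run time; the class and method names are hypothetical.

```java
public class InternPropertySketch {
    // True when both strings map to the same canonical pool entry.
    static boolean sameCanonical(String s, String t) {
        return s.intern() == t.intern();
    }

    public static void main(String[] args) {
        String literal = "";
        // new always allocates a fresh object, so this copy is not the literal.
        String copy = new String(literal);

        System.out.println(literal == copy);              // prints false
        System.out.println(literal.equals(copy));         // prints true
        System.out.println(sameCanonical(literal, copy)); // prints true
        System.out.println(sameCanonical(literal, "x"));  // prints false
    }
}
```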



Answer 9:

If you use Google Code Search, you can find lots of places where people make this same error: search for file:.java \=\=\ \"\" Of course, this can be a correct idiom in carefully controlled circumstances, but usually it's just a bug.



Answer 10:

Why not use:

if (value != null && value.length() == 0) {
    // do stuff (the condition above could instead be "value == null ||")
}

You should use equals() because == for objects compares references, i.e., whether they are the same object. While at compile time Java finds identical string constants and makes them share the same reference (Strings are immutable), at run time it is easy to create empty strings that have different references, and then == fails where you typically intended equals().