I've come across a class that includes multiple uses of a string literal, "foo".
What I'd like to know is: what are the benefits and impact (in terms of object creation, memory usage and speed) of using this approach instead of declaring the String as final and replacing all the literals with the final variable?
For example (although obviously not a real-world usage):
private static final String FINAL_STRING = "foo";
public void stringPrinter() {
    for (int i = 0; i < 10; i++) {
        System.out.println(FINAL_STRING);
    }
}
Versus:
public void stringPrinter() {
    for (int i = 0; i < 10; i++) {
        System.out.println("foo");
    }
}
Which is preferable and why (assuming the string value will remain constant)?
Would the second example above result in 10 String objects being created, or would the JVM realise that only a single literal is actually used and create a single reference? If so, is there any advantage to declaring the String as final (as in the first example)?
If the interpreted code does replace the string literal with a single reference, does that still apply if the same literal occurs in more than one place:
public void stringPrinter() {
    for (int i = 0; i < 5; i++) {
        System.out.println("foo"); // first occurrence
        System.out.println("foo"); // second occurrence
    }
}
They will be exactly the same. The literal is interned in both cases (any compile-time constant expression that results in that string shares the same instance as all other constants and literals), and a smart compiler and runtime should have no trouble reducing both versions to the same optimized code.
The advantage comes more in maintainability. If you want to change the literal, you would need only change one occurrence with a constant but you would need to search and change every instance if they were included inline.
From the JLS
Compile-time constants of type String are always "interned" so as to share unique instances, using the method String.intern.
So, no, there's gonna be only one string object.
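To make the interning behaviour concrete, here is a minimal sketch that compares references with == (used here only to test object identity, not as a recommended way to compare strings). The class and method names are made up for illustration; only FINAL_STRING and "foo" come from the question.

```java
// Hypothetical demo class; the names are illustrative, not from the question.
class InternDemo {
    private static final String FINAL_STRING = "foo";

    // The constant and an inline literal refer to the same String object.
    static boolean literalSharesInstanceWithConstant() {
        return FINAL_STRING == "foo";
    }

    // A constant expression such as "fo" + "o" is folded at compile time
    // and interned as well.
    static boolean constantExpressionIsInterned() {
        return ("fo" + "o") == "foo";
    }

    // new String(...) explicitly allocates a separate object.
    static boolean explicitCopyIsDistinct() {
        String copy = new String("foo");
        return copy != "foo";
    }

    // intern() maps the copy back to the shared, interned instance.
    static boolean internReturnsSharedInstance() {
        String copy = new String("foo");
        return copy.intern() == "foo";
    }

    public static void main(String[] args) {
        System.out.println(literalSharesInstanceWithConstant()); // true
        System.out.println(constantExpressionIsInterned());      // true
        System.out.println(explicitCopyIsDistinct());            // true
        System.out.println(internReturnsSharedInstance());       // true
    }
}
```

So the loop in the question never creates a new String per iteration; you would only get extra objects by asking for them explicitly, for example with new String("foo").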
As Mark notes, this is strictly a question of maintainability and not performance.
The advantage is not in performance, but in maintainability and reliability.
Let me take a real example I came across just recently. A programmer created a function that took a String parameter that identified the type of a transaction. Then in the program he did string compares against this type. Like:
if (type.equals("stock"))
{ ... do whatever ... }
Then he called this function, passing it the value "Stock".
Do you notice the difference in capitalization? Neither did the original programmer. It proved to be a fairly subtle bug to figure out, because even looking at both listings, the difference in capitalization didn't strike me.
If instead he had declared a final static, say
final static String stock="stock";
Then the first time he tried to pass in Stock instead of stock (referring to the constant by name rather than typing out a literal), he would have gotten a compile-time error.
Better still in this example would have been to make an enum, but let's assume he actually had to write the string to an output file or something so it had to be a string.
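A rough sketch of the capitalization bug and the constant-based alternative, assuming a made-up TransactionTypes class (the class name, the STOCK constant name and the isStock helper are hypothetical; only the "stock" and "Stock" values come from the story above):

```java
// Hypothetical holder for the transaction type constants.
class TransactionTypes {
    static final String STOCK = "stock";

    // Stand-in for the original type check inside the function.
    static boolean isStock(String type) {
        return STOCK.equals(type);
    }

    public static void main(String[] args) {
        // Inline literal with the wrong capitalization: compiles fine,
        // but silently fails the comparison at run time.
        System.out.println(isStock("Stock"));                // false

        // Referring to the constant by name cannot be mis-capitalized
        // without the compiler noticing.
        System.out.println(isStock(TransactionTypes.STOCK)); // true

        // isStock(TransactionTypes.Stock); // would not compile: no such field
    }
}
```

The constant only protects callers who actually use it; anyone who still passes a hand-typed literal can make the same mistake.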
Using final statics gives at least three advantages:
(1) If you mis-spell it, you get a compile-time error, rather than a possibly-subtle run-time error.
(2) A static can assign a meaningful name to a value. Which is more comprehensible:
if (employeeType.equals("R")) ...
or
if (employeeType.equals(EmployeeType.RETIRED)) ...
(3) When there are multiple related values, you can put a group of final statics together at the top of the program, thus informing future readers what all the possible values are. I've had plenty of times when I've seen a function compare a value against two or three literals. And that leaves me wondering: Are there other possible values, or is this it? (Better still is often to have an enum, but that's another story.)
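For the enum alternative mentioned in passing, one possible shape is an enum that still carries the string code, so the value can be written to a file while the code compares enum constants. This is only a sketch: here EmployeeType is an enum rather than a holder of String constants, the ACTIVE("A") entry is invented for illustration, and code() and fromCode() are hypothetical helper names.

```java
// Hypothetical enum; only the "R" code for retired comes from the example above.
enum EmployeeType {
    ACTIVE("A"),   // assumed value, for illustration only
    RETIRED("R");

    private final String code;

    EmployeeType(String code) {
        this.code = code;
    }

    // The string form to write to a file or database.
    String code() {
        return code;
    }

    // Converts a stored code back to the enum, rejecting unknown values
    // instead of letting them slip through a string comparison.
    static EmployeeType fromCode(String code) {
        for (EmployeeType type : values()) {
            if (type.code.equals(code)) {
                return type;
            }
        }
        throw new IllegalArgumentException("Unknown employee type code: " + code);
    }
}
```

With this, a check reads as employeeType == EmployeeType.RETIRED, and the full set of legal values is visible in one place.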
All String literals are kept in a shared String pool (this is across all classes), so identical literals refer to the same object.
Using a constant can make the code clearer, give the string some context, and make the code easier to maintain, especially if the same string appears in multiple places.
Those string literals are interned, so no new String objects are created in the loop. Using the same literal twice could still be a code smell, though, just not in terms of speed or memory usage.
In the cases you are providing, I believe the biggest reason for having it declared as FINAL_STRING somewhere is to ensure it stays in one centralized location. There will only ever be one instance of that string constant, but the first example is far easier to maintain.