While investigating for a little debate w.r.t. using "" + n and Integer.toString(int) to convert an integer primitive to a string, I wrote this JMH microbenchmark:
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

@Fork(1)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Benchmark)
public class IntStr {
    // Shared counter, so that every invocation converts a different value.
    protected int counter;

    // Baseline: direct conversion.
    @GenerateMicroBenchmark
    public String integerToString() {
        return Integer.toString(this.counter++);
    }

    // StringBuilder without the empty-string prefix.
    @GenerateMicroBenchmark
    public String stringBuilder0() {
        return new StringBuilder().append(this.counter++).toString();
    }

    // StringBuilder with the empty-string prefix, as in "" + n.
    @GenerateMicroBenchmark
    public String stringBuilder1() {
        return new StringBuilder().append("").append(this.counter++).toString();
    }

    // StringBuilder that appends an already converted string.
    @GenerateMicroBenchmark
    public String stringBuilder2() {
        return new StringBuilder().append("").append(Integer.toString(this.counter++)).toString();
    }

    @GenerateMicroBenchmark
    public String stringFormat() {
        return String.format("%d", this.counter++);
    }

    // Reset the counter before each iteration.
    @Setup(Level.Iteration)
    public void prepareIteration() {
        this.counter = 0;
    }
}
I ran it with the default JMH options with both Java VMs that exist on my Linux machine (up-to-date Mageia 4 64-bit, Intel i7-3770 CPU, 32GB RAM). The first JVM was the one supplied with Oracle JDK 8u5 64-bit:
java version "1.8.0_05"
Java(TM) SE Runtime Environment (build 1.8.0_05-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.5-b02, mixed mode)
With this JVM I got pretty much what I expected:
Benchmark                 Mode   Samples       Mean  Mean error  Units
b.IntStr.integerToString  thrpt       20  32317.048     698.703  ops/ms
b.IntStr.stringBuilder0   thrpt       20  28129.499     421.520  ops/ms
b.IntStr.stringBuilder1   thrpt       20  28106.692    1117.958  ops/ms
b.IntStr.stringBuilder2   thrpt       20  20066.939    1052.937  ops/ms
b.IntStr.stringFormat     thrpt       20   2346.452      37.422  ops/ms
I.e. using the StringBuilder class is slower due to the additional overhead of creating the StringBuilder object and appending an empty string. Using String.format(String, ...) is even slower, by an order of magnitude or so.
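For reference, here is a minimal sketch of the conversions being compared, outside of JMH. The class and method names are only illustrative. As far as I know, javac for Java 7 and 8 compiles "" + n into the same StringBuilder chain as stringBuilder1, which is why that variant is in the benchmark. The String.format variant is pinned to Locale.ROOT here (an assumption that is not in the benchmark) so that the comparison does not depend on the default locale.

```java
import java.util.Locale;

// Illustrative only: checks that all conversion variants yield the same string.
final class IntToStringVariants {
    static String concat(int n) {
        return "" + n; // the idiom under debate
    }

    static String integerToString(int n) {
        return Integer.toString(n);
    }

    static String stringBuilder0(int n) {
        return new StringBuilder().append(n).toString();
    }

    static String stringBuilder1(int n) {
        return new StringBuilder().append("").append(n).toString();
    }

    static String stringBuilder2(int n) {
        return new StringBuilder().append("").append(Integer.toString(n)).toString();
    }

    static String stringFormat(int n) {
        // Locale.ROOT is an assumption made here to avoid locale-specific digits.
        return String.format(Locale.ROOT, "%d", n);
    }

    // True if every variant agrees with Integer.toString(int) for this value.
    static boolean allAgree(int n) {
        String expected = integerToString(n);
        return expected.equals(concat(n))
                && expected.equals(stringBuilder0(n))
                && expected.equals(stringBuilder1(n))
                && expected.equals(stringBuilder2(n))
                && expected.equals(stringFormat(n));
    }

    public static void main(String[] args) {
        int[] samples = {0, -1, 42, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int n : samples) {
            System.out.println(n + " -> all variants agree: " + allAgree(n));
        }
    }
}
```

So the variants differ only in how they get to the result, not in the result itself.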
The distribution-provided JVM, on the other hand, is based on OpenJDK 1.7:
java version "1.7.0_55"
OpenJDK Runtime Environment (mageia-2.4.7.1.mga4-x86_64 u55-b13)
OpenJDK 64-Bit Server VM (build 24.51-b03, mixed mode)
The results here were interesting:
Benchmark                 Mode   Samples       Mean  Mean error  Units
b.IntStr.integerToString  thrpt       20  31249.306     881.125  ops/ms
b.IntStr.stringBuilder0   thrpt       20  39486.857     663.766  ops/ms
b.IntStr.stringBuilder1   thrpt       20  41072.058     484.353  ops/ms
b.IntStr.stringBuilder2   thrpt       20  20513.913     466.130  ops/ms
b.IntStr.stringFormat     thrpt       20   2068.471      44.964  ops/ms
Why does StringBuilder.append(int) appear so much faster with this JVM? Looking at the StringBuilder class source code revealed nothing particularly interesting - the method in question is almost identical to Integer#toString(int). Interestingly enough, appending the result of Integer.toString(int) (the stringBuilder2 microbenchmark) does not appear to be faster.
Is this performance discrepancy an issue with the testing harness? Or does my OpenJDK JVM contain optimizations that would affect this particular code (anti)-pattern?
EDIT:
For a more straightforward comparison, I installed Oracle JDK 1.7u55:
java version "1.7.0_55"
Java(TM) SE Runtime Environment (build 1.7.0_55-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.55-b03, mixed mode)
The results are similar to those of OpenJDK:
Benchmark                 Mode   Samples       Mean  Mean error  Units
b.IntStr.integerToString  thrpt       20  32502.493     501.928  ops/ms
b.IntStr.stringBuilder0   thrpt       20  39592.174     428.967  ops/ms
b.IntStr.stringBuilder1   thrpt       20  40978.633     544.236  ops/ms
It seems that this is a more general Java 7 vs Java 8 issue. Perhaps Java 7 had more aggressive string optimizations?
EDIT 2:
For completeness, here are the string-related VM options for both of these JVMs:
For Oracle JDK 8u5:
$ /usr/java/default/bin/java -XX:+PrintFlagsFinal 2>/dev/null | grep String
bool OptimizeStringConcat = true {C2 product}
intx PerfMaxStringConstLength = 1024 {product}
bool PrintStringTableStatistics = false {product}
uintx StringTableSize = 60013 {product}
For OpenJDK 1.7:
$ java -XX:+PrintFlagsFinal 2>/dev/null | grep String
bool OptimizeStringConcat = true {C2 product}
intx PerfMaxStringConstLength = 1024 {product}
bool PrintStringTableStatistics = false {product}
uintx StringTableSize = 60013 {product}
bool UseStringCache = false {product}
The UseStringCache option was removed in Java 8 with no replacement, so I doubt that makes any difference. The rest of the options appear to have the same settings.
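The same flags can also be read from inside a running JVM, which helps to confirm what the forked benchmark JVM actually uses. This is only a sketch, assuming a HotSpot-based JVM that exposes the com.sun.management.HotSpotDiagnosticMXBean interface; the class name StringFlagProbe is made up for illustration. Flags that the running JVM does not know about (such as UseStringCache on Java 8) are reported as "unavailable".

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Illustrative only: reads HotSpot VM options of the current JVM by name.
final class StringFlagProbe {
    // Returns the flag value as a string, or "unavailable" if this JVM
    // does not know the flag or does not provide the diagnostic bean.
    static String flagValue(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption(name).getValue();
        } catch (RuntimeException e) {
            // getVMOption throws IllegalArgumentException for unknown flags.
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        String[] names = {"OptimizeStringConcat", "StringTableSize", "UseStringCache"};
        for (String name : names) {
            System.out.println(name + " = " + flagValue(name));
        }
    }
}
```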
EDIT 3:
A side-by-side comparison of the source code of the AbstractStringBuilder, StringBuilder and Integer classes from the src.zip file of each JDK reveals nothing noteworthy. Apart from a whole lot of cosmetic and documentation changes, Integer now has some support for unsigned integers and StringBuilder has been slightly refactored to share more code with StringBuffer. None of these changes seem to affect the code paths used by StringBuilder#append(int), although I may have missed something.
A comparison of the assembly code generated for IntStr#integerToString() and IntStr#stringBuilder0() is far more interesting. The basic layout of the code generated for IntStr#integerToString() was similar for both JVMs, although Oracle JDK 8u5 seemed to be more aggressive w.r.t. inlining some calls within the Integer#toString(int) code. There was a clear correspondence with the Java source code, even for someone with minimal assembly experience.
The assembly code for IntStr#stringBuilder0(), however, was radically different. The code generated by Oracle JDK 8u5 was once again directly related to the Java source code - I could easily recognise the same layout. In contrast, the code generated by OpenJDK 7 was almost unrecognisable to the untrained eye (like mine). The new StringBuilder() call was seemingly removed, as was the creation of the array in the StringBuilder constructor. Additionally, the disassembler plugin was not able to provide as many references to the source code as it did in JDK 8.
I assume that this is either the result of a much more aggressive optimization pass in OpenJDK 7, or more probably the result of inserting hand-written low-level code for certain StringBuilder operations. I am unsure why this optimization does not happen in my Java 8 JVM or why the same optimizations were not implemented for Integer#toString(int) in the Java 7 JVM. I guess someone familiar with the related parts of the JRE source code would have to answer these questions...
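One way to probe the first guess without JMH is a crude timing loop that can be started with and without -XX:-OptimizeStringConcat (the flag from the listings above). This is only a rough sanity check under my own assumptions, not a replacement for the JMH benchmark: the class name, iteration count and number of rounds are arbitrary, and the timings are subject to all the usual microbenchmarking pitfalls.

```java
// Illustrative only: a JMH-free sanity check for the two most interesting variants.
final class ConcatSanityCheck {
    private static int counter;

    static String integerToString() {
        return Integer.toString(counter++);
    }

    static String stringBuilder0() {
        return new StringBuilder().append(counter++).toString();
    }

    // Converts the values 0..iterations-1 with the chosen variant and returns
    // the total number of characters produced, so the results are actually used.
    static long run(boolean useBuilder, int iterations) {
        counter = 0;
        long totalChars = 0;
        for (int i = 0; i < iterations; i++) {
            String s = useBuilder ? stringBuilder0() : integerToString();
            totalChars += s.length();
        }
        return totalChars;
    }

    public static void main(String[] args) {
        int iterations = 20_000_000; // arbitrary; the early rounds act as warm-up
        for (int round = 0; round < 5; round++) {
            for (boolean useBuilder : new boolean[] {false, true}) {
                long start = System.nanoTime();
                long chars = run(useBuilder, iterations);
                long elapsedMs = (System.nanoTime() - start) / 1_000_000;
                System.out.println((useBuilder ? "stringBuilder0  " : "integerToString ")
                        + elapsedMs + " ms (" + chars + " chars)");
            }
        }
    }
}
```

Running it once as java ConcatSanityCheck and once as java -XX:-OptimizeStringConcat ConcatSanityCheck on each JVM should at least show whether that flag is involved in the difference.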