As written in JEP 280:
Change the static String-concatenation bytecode sequence generated by javac to use invokedynamic calls to JDK library functions. This will enable future optimizations of String concatenation without requiring further changes to the bytecode emitted by javac.
Here I want to understand what the use of invokedynamic calls is, and how bytecode concatenation is different from invokedynamic?
The "old" way output a bunch of StringBuilder-oriented operations. Consider this program:

If we compile that with JDK 8 or earlier and then use javap -c Example to see the bytecode, we see something like this:

As you can see, it creates a StringBuilder and uses append. This is famously fairly inefficient, as the default capacity of the built-in buffer in StringBuilder is only 16 chars, and there's no way for the compiler to know to allocate more in advance, so it ends up having to reallocate. It's also a bunch of method calls. (Note that the JVM can sometimes detect and rewrite these patterns of calls to make them more efficient, though.)

Let's look at what Java 9 generates:

Oh my but that's shorter. :-) It makes a single call to makeConcatWithConstants from StringConcatFactory, which says this in its Javadoc:

I'll add a few details here. The main point is that how string concatenation is done is now a runtime decision, not a compile-time one. Thus it can change: once you have compiled your code against Java 9, the JDK can change the underlying implementation however it pleases, without the need to recompile.
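To make the difference between the two translations concrete, here is a minimal sketch. The class name ConcatSketch and both method names are made up for illustration. viaBuilder spells out by hand roughly the sequence of StringBuilder calls that the older compilers emitted for a + expression, while viaPlus leaves the choice to the compiler. Compiling this with JDK 9 or later and running javap -c on it should show an invokedynamic instruction in viaPlus and explicit append calls in viaBuilder.

```java
public class ConcatSketch {

    // Plain concatenation: the compiler decides how to translate this.
    // On JDK 9+ javac emits an invokedynamic call site here by default.
    public static String viaPlus(String a, String b, String c) {
        return a + "-" + b + "-" + c;
    }

    // Hand-written approximation of the pre-Java 9 translation:
    // one StringBuilder, a chain of append calls, then toString.
    public static String viaBuilder(String a, String b, String c) {
        StringBuilder sb = new StringBuilder(); // default capacity, may need to grow
        sb.append(a);
        sb.append("-");
        sb.append(b);
        sb.append("-");
        sb.append(c);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(viaPlus("a", "b", "c"));    // prints a-b-c
        System.out.println(viaBuilder("a", "b", "c")); // prints a-b-c
    }
}
```

Both methods return the same string; only the bytecode behind them differs, which is the point of moving the decision to runtime.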
And the second point is that at the moment there are 6 possible strategies for concatenation of String:

You can choose any of them via a parameter: -Djava.lang.invoke.stringConcat. Notice that StringBuilder is still an option.

Before going into the details of the invokedynamic implementation used for optimisation of String concatenation, in my opinion, one must get some background on "What's invokedynamic and how do I use it?"

I will try to take you through these, along with the changes that were brought in for the implementation of String concatenation optimisation.
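As background on the mechanism itself, here is a rough sketch of what a bootstrap method looks like. Everything named here (BootstrapSketch, bootstrap, join, demo) is hypothetical and only illustrates the shape. In real code the JVM calls the bootstrap method the first time an invokedynamic instruction executes; since plain Java source cannot emit that instruction directly, the sketch calls the bootstrap method by hand and then invokes the call site it returns.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class BootstrapSketch {

    // Hypothetical target that the call site will be linked to.
    public static String join(String a, String b) {
        return a + b;
    }

    // Hypothetical bootstrap method with the standard leading arguments:
    // a lookup, the name used at the call site, and the call site's type.
    public static CallSite bootstrap(MethodHandles.Lookup lookup,
                                     String name,
                                     MethodType type)
            throws ReflectiveOperationException {
        MethodHandle target = lookup.findStatic(BootstrapSketch.class, name, type);
        // A ConstantCallSite is linked once and never retargeted.
        return new ConstantCallSite(target);
    }

    // Simulates by hand what the JVM does on the first invocation.
    public static String demo(String a, String b) {
        try {
            MethodType type =
                MethodType.methodType(String.class, String.class, String.class);
            CallSite site = bootstrap(MethodHandles.lookup(), "join", type);
            return (String) site.dynamicInvoker().invokeExact(a, b);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo("foo", "bar")); // prints foobar
    }
}
```

The part that matters for string concatenation is that the code behind the call site is picked by the bootstrap method at link time, not by the compiler.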
Defining the Bootstrap Method:- With Java 9, the bootstrap methods for invokedynamic call sites that support string concatenation, primarily makeConcat and makeConcatWithConstants, were introduced with the StringConcatFactory implementation.

The use of invokedynamic makes it possible to defer the selection of a translation strategy until runtime. The translation strategy used in StringConcatFactory is similar to that of the LambdaMetafactory introduced in the previous Java version. Additionally, one of the goals of the JEP mentioned in the question is to stretch these strategies further.

Specifying Constant Pool Entries:- These are the additional static arguments to the invokedynamic instruction, other than (1) the MethodHandles.Lookup object, which is a factory for creating method handles in the context of the invokedynamic instruction, (2) a String object, the method name mentioned in the dynamic call site, and (3) the MethodType object, the resolved type signature of the dynamic call site.

These are already linked during the linkage of the code. At runtime, the bootstrap method runs and links in the actual code doing the concatenation. It rewrites the invokedynamic call with an appropriate invokestatic call. This loads the constant string from the constant pool; the bootstrap method's static arguments are leveraged to pass these and other constants straight to the bootstrap method call.

Using the invokedynamic Instruction:- This offers the facilities for lazy linkage, by providing the means to bootstrap the call target once, during the initial invocation. The concrete idea for optimisation here is to replace the entire StringBuilder.append dance with a simple invokedynamic call to java.lang.invoke.StringConcatFactory, which will accept the values in need of concatenation.

The Indify String Concatenation proposal illustrates this with an example: it benchmarks an application with Java 9 in which a method similar to the one shared by @T.J. Crowder is compiled, and the difference in the bytecode between the varying implementations is fairly visible.
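The bootstrap step described above can also be tried by hand. The following is a sketch only, assuming Java 9 or later; the class ConcatFactorySketch and the method joinWithDashes are invented for illustration. It calls StringConcatFactory.makeConcatWithConstants directly, passing the three standard arguments plus a recipe string in which the character \u0001 marks where each dynamic argument goes, and then invokes the resulting call site. In compiled code the JVM performs this linkage itself the first time the invokedynamic instruction runs.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

public class ConcatFactorySketch {

    // Hypothetical helper: links a concatenation call site by hand
    // and uses it to build "a-b-c" style strings.
    public static String joinWithDashes(String a, String b, String c) {
        try {
            // Type of the call site: three String arguments, String result.
            MethodType type = MethodType.methodType(
                String.class, String.class, String.class, String.class);

            // Recipe: each \u0001 is a slot for one dynamic argument,
            // the dashes are constants baked into the call site.
            CallSite site = StringConcatFactory.makeConcatWithConstants(
                MethodHandles.lookup(),
                "makeConcatWithConstants",
                type,
                "\u0001-\u0001-\u0001");

            return (String) site.getTarget().invokeExact(a, b, c);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(joinWithDashes("a", "b", "c")); // prints a-b-c
    }
}
```

Linking a fresh call site on every call, as this helper does, is wasteful and is done here only to keep the example short; a real invokedynamic call site is linked once and then reused.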