I was curious as to why Strings can be created without a call to new String(), as the API mentions it is an Object of class java.lang.String. So how are we able to use String s="hi" rather than String s=new String("hi")?
This post clarified the use of the == operator and the absence of new, and says this is due to String literals being interned or taken from a literal pool by the JVM, hence Strings are immutable.
On seeing a statement such as String s="hi" for the first time, what really takes place? Does the JVM replace it with String s=new String("hi"), wherein an Object is created and "hi" is added to the String literal pool, so that subsequent statements such as String s1="hi" are taken from the pool?
Is this how the underlying mechanism operates? If so, then is
String s=new String("Test"); String s1="Test";
the same as
String s="Test"; String s1="Test";
in terms of memory utilization and efficiency?
Also, is there any way by which we can access the String Pool to check how many String literals are present in it, how much space is occupied, etc.?
The following is a slight simplification, so don't try to cite exact details from it, but the general principles apply.
Each compiled Java class contains a data blob that indicates how many strings were declared in that class file, how long each one is, and the characters that belong in all of them. When the class is loaded, the class loader will create a String[] of suitable size to hold all of the strings defined in that class; for each string, it will then generate a char[] of suitable size, read the appropriate number of characters from the class file into the char[], create a String encapsulating those characters, and store the reference into the class's String[].
When compiling some class (e.g. Foo), the compiler knows which string literal it encounters first, second, third, fourth, etc. If code says myString = "George"; and George was the sixth string literal, that will appear in code as a "load string literal #6" instruction; the just-in-time compiler, when it is generating code for that instruction, will generate an instruction to fetch the sixth string reference associated with that class.
The Java compiler has special support for string literals. If it did not, it would be really cumbersome to create strings in your source code; you'd have to write something like:
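A minimal sketch of what that might look like, assuming the only way to build a string were the String(char[]) constructor (the class and method names here are hypothetical):

```java
class NoLiteralDemo {
    // Builds the text "hi" one character at a time, without using a string literal.
    static String buildWithoutLiteral() {
        char[] chars = {'h', 'i'};
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(buildWithoutLiteral());
    }
}
```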
To answer your questions:
More or less, and if you really want to know the details, you'd have to study the source code of the JVM, which you can find at OpenJDK, but be warned that it's huge and complicated.
No, those two are not equivalent. In the first case you are explicitly creating a new String object with new String("Test"), which will contain a copy of the String object represented by the literal "Test". Note that it is never a good idea to write new String("some literal") in Java: strings are immutable, and it is never necessary to make a copy of a string literal.
There's no way I know of to check what's in the string pool.
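The difference between the two cases from the second question can be observed by comparing references with ==. This is only a small sketch; the class and method names are illustrative and not part of the question:

```java
class NewVsLiteralDemo {
    // First case: an explicit copy plus a literal. Two distinct objects.
    static boolean explicitCopyIsSameObject() {
        String s = new String("Test");
        String s1 = "Test";
        return s == s1; // false: s is a separate object holding equal content
    }

    // Second case: two uses of the same literal. One shared object.
    static boolean literalsAreSameObject() {
        String s = "Test";
        String s1 = "Test";
        return s == s1; // true: both refer to the same interned String
    }

    public static void main(String[] args) {
        System.out.println(explicitCopyIsSameObject());
        System.out.println(literalsAreSameObject());
    }
}
```

In both cases s.equals(s1) is true; only the object identity differs.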
I believe that the underlying mechanism for creating a String is a StringBuilder which assembles the String object at the end. At least I know for sure that if you have a string that you want to change, for example:
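A minimal sketch of such a change, assuming a hypothetical variable s that gets text appended to it. The second method spells out the StringBuilder form that older javac versions generated for +=; since Java 9 the compiler may emit an invokedynamic call instead, so treat the expansion as illustrative rather than exact:

```java
class AppendDemo {
    // Appending with += does not modify the original String;
    // a new String is built and s is pointed at it.
    static String appendWithPlus() {
        String s = "hello";
        s += " world";
        return s;
    }

    // Roughly the StringBuilder-based equivalent of the += above.
    static String appendWithBuilder() {
        String s = "hello";
        s = new StringBuilder().append(s).append(" world").toString();
        return s;
    }

    public static void main(String[] args) {
        System.out.println(appendWithPlus().equals(appendWithBuilder()));
    }
}
```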
So what this does is create a StringBuilder from the old object and replace the old String with a new one constructed from the builder. This is why it is more memory efficient to use a StringBuilder instead of a regular String when you are just appending to it repeatedly.
There is a way to access the already created pool of Strings, which is by using the String.intern() method. It tells Java to use the same pooled instance for Strings which have the same content and gives you a reference to that instance. This also allows you to use the == operator to compare strings, provided both have been interned, and is more memory efficient.
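A short sketch of how intern() behaves, assuming a literal "hi" and an explicit copy of it (the class and method names are illustrative):

```java
class InternDemo {
    // Without intern(), the explicit copy is a different object from the literal.
    static boolean sameBeforeIntern() {
        String literal = "hi";
        String copy = new String("hi");
        return copy == literal; // false
    }

    // intern() returns the pooled instance, which is the one the literal refers to.
    static boolean sameAfterIntern() {
        String literal = "hi";
        String copy = new String("hi");
        return copy.intern() == literal; // true
    }

    public static void main(String[] args) {
        System.out.println(sameBeforeIntern());
        System.out.println(sameAfterIntern());
    }
}
```

Note that intern() only returns the pooled reference; it does not list or expose the contents of the pool.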
That's not tightly related to the subject, but whenever you have doubts as to what the Java compiler will do, you can use a disassembler such as javap (for example, javap -c CompiledClassName, run from the directory where CompiledClassName.class is) to print what is actually going on.
To add to Jesper's answer, there are more mechanisms at work. For example, when you concatenate a String from literals or final variables, the result will still come from the intern pool:
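A small sketch of this case, assuming the final variable is initialized with a literal so that the whole expression is a compile-time constant (names are illustrative):

```java
class ConstantConcatDemo {
    // Two literals: the compiler folds the expression into the single constant "hello".
    static boolean literalConcatIsPooled() {
        String s = "hel" + "lo";
        return s == "hello"; // true
    }

    // A final variable initialized with a literal also counts as a constant.
    static boolean finalConcatIsPooled() {
        final String prefix = "hel";
        String s = prefix + "lo";
        return s == "hello"; // true
    }

    public static void main(String[] args) {
        System.out.println(literalConcatIsPooled());
        System.out.println(finalConcatIsPooled());
    }
}
```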
But when you concatenate using non-final variables it will not use the pool:
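A matching sketch for the non-final case, again with illustrative names. The concatenation happens at run time and produces a new String object, so == against the literal fails even though the contents are equal:

```java
class RuntimeConcatDemo {
    // prefix is not final, so the concatenation is not a compile-time constant.
    static boolean nonFinalConcatIsPooled() {
        String prefix = "hel";
        String s = prefix + "lo";
        return s == "hello"; // false: s is a new object created at run time
    }

    // The contents are still equal.
    static boolean contentsAreEqual() {
        String prefix = "hel";
        String s = prefix + "lo";
        return s.equals("hello"); // true
    }

    public static void main(String[] args) {
        System.out.println(nonFinalConcatIsPooled());
        System.out.println(contentsAreEqual());
    }
}
```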
No. What really happens is that String literals are resolved at compile time and interned (added to the String constants pool) as soon as the class is loaded / initialized, or lazily. Thus, they are made available to the classes within the JVM. Note that even if you have a String with value "hi" in the String constants pool, new String("hi") will create another String on the heap and return its reference.
No, in the first case 2 "Test" Strings are created. One will be added to the String constants pool (assuming it is not already present there) and another on the heap. The second one can be GCed. In the second case, only one String literal is present in the String constants pool and there are 2 references to it (s and s1).
I don't think we can see the contents of the String constants pool. We can merely assume and confirm the behavior based on our assumptions.
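The String pool is a pool of strings stored in the heap. For example, with String s="Test"; String s1="Test"; both references are stored in the heap and refer to a single "Test", thus s1==s, while new String("Test") is an object that is also stored in the heap but is different from the one that s and s1 refer to.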