I've read in many places that when you create an Integer between -128 and 127 in Java, it returns an already created object instead of creating a new one.
I don't see any point in doing this other than letting novice programmers compare Integer objects with == to see whether they hold the same number. I think that is bad, because they will surely assume they can compare any Integer with ==, and it also teaches a bad practice in any programming language: comparing the contents of two 'different' objects with ==.
Is there any other reason why this is done? Or is it just a bad language design decision (in my view), like optional semicolons in JavaScript?
EDIT: I see here that they explain the behaviour: Why does the behavior of the Integer constant pool change at 127?
I'm asking why it was designed to behave this way, not why the behaviour occurs.
It's called the Flyweight pattern and is used to minimize memory usage.
Those numbers are very likely to be used repeatedly, and wrapper types like Integer are immutable, so a single instance can be shared safely (note that this is done not just for Integer). Caching them avoids creating lots of instances and reduces GC (garbage collection) work as well.
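To make the Flyweight idea concrete, here is a minimal sketch of such a cache. `SmallIntBox` is a hypothetical class invented for illustration, not the real `java.lang.Integer` source; it only assumes the same -128 to 127 range mentioned above.

```java
public class FlyweightSketch {

    // Hypothetical immutable box type standing in for java.lang.Integer.
    static final class SmallIntBox {
        private static final int LOW = -128;
        private static final int HIGH = 127;

        // One shared instance per value in [LOW, HIGH], built once at class load.
        private static final SmallIntBox[] CACHE = new SmallIntBox[HIGH - LOW + 1];

        static {
            for (int i = 0; i < CACHE.length; i++) {
                CACHE[i] = new SmallIntBox(LOW + i);
            }
        }

        private final int value;

        private SmallIntBox(int value) {
            this.value = value;
        }

        // Factory method: reuse a cached instance when possible, allocate otherwise.
        static SmallIntBox valueOf(int v) {
            if (v >= LOW && v <= HIGH) {
                return CACHE[v - LOW];
            }
            return new SmallIntBox(v);
        }

        int intValue() {
            return value;
        }
    }

    public static void main(String[] args) {
        SmallIntBox a = SmallIntBox.valueOf(100);
        SmallIntBox b = SmallIntBox.valueOf(100);
        SmallIntBox c = SmallIntBox.valueOf(1000);
        SmallIntBox d = SmallIntBox.valueOf(1000);

        System.out.println(a == b);                       // true: same cached object
        System.out.println(c == d);                       // false: two separate allocations
        System.out.println(c.intValue() == d.intValue()); // true: the values are equal
    }
}
```

Sharing is only safe because the box is immutable: no caller can change the value that another caller sees through the same reference.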
The JLS covers this specifically in section 5.1.7, Boxing Conversion:
If the value p being boxed is true, false, a byte, or a char in the range \u0000 to \u007f, or an int or short number between -128 and 127 (inclusive), then let r1 and r2 be the results of any two boxing conversions of p. It is always the case that r1 == r2.
Ideally, boxing a given primitive value p, would always yield an identical reference. In practice, this may not be feasible using existing implementation techniques. The rules above are a pragmatic compromise. The final clause above requires that certain common values always be boxed into indistinguishable objects. The implementation may cache these, lazily or eagerly. For other values, this formulation disallows any assumptions about the identity of the boxed values on the programmer's part. This would allow (but not require) sharing of some or all of these references.
This ensures that in most common cases, the behavior will be the desired one, without imposing an undue performance penalty, especially on small devices. Less memory-limited implementations might, for example, cache all char and short values, as well as int and long values in the range of -32K to +32K.
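The guarantee quoted above can be observed with a small demo. The class and method names here are made up for illustration. Only the results for values between -128 and 127 are guaranteed by the JLS; for values outside that range the result of == depends on the implementation, so no expected output is stated for it.

```java
public class BoxingIdentityDemo {

    // Boxes the same primitive twice and compares the two references.
    static boolean sameReference(int p) {
        Integer r1 = p; // boxing conversion
        Integer r2 = p; // second boxing conversion of the same value
        return r1 == r2;
    }

    // Boxes the same primitive twice and compares the boxed values.
    static boolean sameValue(int p) {
        Integer r1 = p;
        Integer r2 = p;
        return r1.equals(r2);
    }

    public static void main(String[] args) {
        System.out.println(sameReference(127)); // true: guaranteed by JLS 5.1.7
        System.out.println(sameReference(128)); // implementation-dependent, do not rely on it
        System.out.println(sameValue(128));     // true: equals() compares values
    }
}
```

This is why equals() (or unboxing to int) is the right way to compare Integer values: it gives the same answer regardless of whether the objects happen to be cached.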
I think that creating any object takes more time than fetching an existing one from a cache. Moreover, if I am not mistaken, every object on the heap takes up additional space for its header (typically 12 to 16 bytes on a 64-bit HotSpot JVM) on top of its fields. In a typical program, most operations are done on small ints (in this case, small Integers), so caching them saves a lot of space and improves performance a little.