Why is Java's String memory usage said to be h

Posted 2020-05-26 02:52

Question:

According to this blog post, the minimum memory usage of a String is:

8 * (int) ((((number of chars) * 2) + 45) / 8) bytes.

So for the String "Apple Computers", the minimum memory usage would be 72 bytes.
Even if I have 10,000 String objects of twice that length, the memory usage would be less than 2 MB, which isn't much at all. So does that mean I'm underestimating the number of Strings present in an enterprise application, or is that formula wrong?
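To make the arithmetic concrete, here is a minimal sketch that just evaluates the quoted formula. The class and method names (StringMemoryEstimate, minBytes) are made up for illustration, and the result is only the blog post's estimate, not a measurement of any particular JVM.

```java
// Sketch: evaluates the quoted estimate 8 * (int) (((chars * 2) + 45) / 8).
// Hypothetical helper names; this is not a measurement of a real JVM.
class StringMemoryEstimate {

    // Estimated minimum bytes for a String with the given number of chars.
    static long minBytes(int chars) {
        // Integer division truncates, which plays the role of the (int) cast.
        return 8L * ((chars * 2L + 45) / 8);
    }

    public static void main(String[] args) {
        String s = "Apple Computers";              // 15 chars
        System.out.println(minBytes(s.length()));  // 72

        // 10,000 strings of twice that length (30 chars each).
        long total = 10_000 * minBytes(2 * s.length());
        System.out.println(total + " bytes");      // 1040000 bytes
    }
}
```

By this estimate the 10,000 longer strings come to roughly 1 MB, which matches the "less than 2 MB" figure.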

Thanks

Answer 1:

String storage in Java depends on how the string was obtained. The backing char array can be shared between multiple instances. If it isn't shared, you have the usual object overhead plus storage for one pointer and three ints, which usually comes out to 16 bytes of overhead. On top of that, the backing array requires 2 bytes per char, since chars are UTF-16 code units.

For "Apple Computers", where the backing array is not shared, the minimum cost is going to be

  1. backing array for 15 chars -- 30B, padded to 32B so that it aligns on a word boundary.
  2. pointer to array - 4 or 8B depending on the platform
  3. three ints for the offset, length, and memoized hashcode - 12B
  4. 2 x object overhead (one for the String, one for the array) - depends on the VM, but 8B each is a good rule of thumb.
  5. one int for the array length - 4B

So roughly 72B, of which the 32B character array accounts for 44.4%. That share grows for longer strings.
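The tally above can be sketched as a small calculation. This is only an illustration of the rule-of-thumb numbers in this answer: the names (StringCostBreakdown, estimate, align8) are hypothetical, and the 8B object header and the padding of the array data to a multiple of 8 are assumptions, not values read from a real VM.

```java
// Sketch of the itemized estimate, using the answer's rule-of-thumb sizes.
// All constants are assumptions for illustration, not measured values.
class StringCostBreakdown {
    static final int OBJECT_HEADER = 8; // assumed per-object overhead
    static final int INT = 4;           // size of one int field

    // Round up to the next multiple of 8 (assumed alignment).
    static int align8(int n) {
        return (n + 7) / 8 * 8;
    }

    // chars: string length; pointerBytes: 4 or 8 depending on the platform.
    static int estimate(int chars, int pointerBytes) {
        int arrayData   = align8(chars * 2);  // 2 bytes per UTF-16 code unit
        int intFields   = 3 * INT;            // offset, length, cached hash
        int headers     = 2 * OBJECT_HEADER;  // String object + char[] object
        int arrayLength = INT;                // length slot of the array
        return arrayData + pointerBytes + intFields + headers + arrayLength;
    }

    public static void main(String[] args) {
        int chars = "Apple Computers".length();  // 15
        System.out.println(estimate(chars, 4));  // 68
        System.out.println(estimate(chars, 8));  // 72
        // Share of the total taken by the (padded) character array.
        System.out.println(100.0 * align8(chars * 2) / estimate(chars, 8));
    }
}
```

With 4-byte pointers the same tally comes to 68B, which is why the total is given as "roughly" 72B.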


In Java 7, some JDK implementations are doing away with backing array sharing to avoid pinning large char[]s in memory. That allows them to drop two of the three ints (the offset and the length), leaving only the memoized hashcode.

That changes the calculation to 64B for a string of length 16, of which the actual payload constitutes 50%.
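As a rough illustration of how the two layouts compare, and of how the payload share grows with length, the sketch below applies the same rule-of-thumb sizes to both. It assumes 8-byte pointers, 8B object headers, and 8-byte padding of the array data; the names (StringLayoutComparison, totalBytes, payloadShare) are invented for this example.

```java
// Sketch comparing the estimated cost with three int fields (offset, length,
// hash) against one int field (hash only). Sizes are assumed rules of thumb.
class StringLayoutComparison {

    // Round up to the next multiple of 8 (assumed alignment).
    static int align8(int n) {
        return (n + 7) / 8 * 8;
    }

    // intFields is 3 for the shared-array layout, 1 for the newer layout.
    static int totalBytes(int chars, int intFields) {
        int arrayData   = align8(chars * 2); // 2 bytes per char
        int pointer     = 8;                 // assumed 8-byte reference
        int headers     = 2 * 8;             // String + char[] headers
        int arrayLength = 4;                 // array length slot
        return arrayData + pointer + intFields * 4 + headers + arrayLength;
    }

    // Percentage of the total that is actual character data.
    static double payloadShare(int chars, int intFields) {
        return 100.0 * (chars * 2) / totalBytes(chars, intFields);
    }

    public static void main(String[] args) {
        for (int chars : new int[] {16, 64, 256}) {
            System.out.println(chars + " chars: "
                    + totalBytes(chars, 3) + "B with three ints, "
                    + totalBytes(chars, 1) + "B with one int, payload share "
                    + payloadShare(chars, 1) + "%");
        }
    }
}
```

For 16 chars this gives 72B versus 64B; the 8B saved per string is fixed, so it matters most for short strings.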



Answer 2:

Is it possible to save character data using less memory than a Java String? Yes.

Does it matter for "enterprise" applications (or even Android or J2ME applications, which have to get by on a lot less memory)? Almost never.

Premature optimization is the root...



Answer 3:

Compared to the other data types that you have, it is definitely high. The primitives use 32 bits, 64 bits, etc.

And given that String is immutable, every operation that would change it (concatenation, replacement, and so on) ends up creating a new String object, consuming even more memory.
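A small sketch of what immutability means in practice: the original String is never changed, and each concatenation produces a new object. The class and method names (ImmutableStringDemo, joinWithConcat, joinWithBuilder) are made up for this example; StringBuilder is shown only as the usual way to avoid the intermediate Strings.

```java
// Sketch: String operations return new objects and leave the original untouched.
class ImmutableStringDemo {

    // Each + in the loop builds a new String holding everything so far.
    static String joinWithConcat(String[] parts) {
        String result = "";
        for (String part : parts) {
            result = result + part;
        }
        return result;
    }

    // A StringBuilder appends into one growable buffer and creates
    // the final String once, in toString().
    static String joinWithBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "Apple";
        String combined = original.concat(" Computers");

        System.out.println(original);             // Apple
        System.out.println(combined);             // Apple Computers
        System.out.println(original == combined); // false

        String[] parts = {"a", "b", "c"};
        System.out.println(joinWithConcat(parts).equals(joinWithBuilder(parts))); // true
    }
}
```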