Size of empty Java String

Posted 2019-02-22 02:42

Question:

I heard a colleague say that I would pay "24 bytes" if I dropped a String member in a Java class, even if the String is empty. Is that accurate? Is it the same for Integer, Float, Double? (as opposed to int, float, double, which would be only 4, 4 and 8 bytes each).

Answer 1:

You'll pay 4 or 8 bytes for the reference. Whether you'll pay for an extra object per instance of your "container" object depends on how you get your empty string. For example, if you use the literal "" then all the instances will refer to the same object, so you'll only need to pay for the reference itself.

If you're creating a separate empty string for each instance, then obviously that will take more memory.
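
To make the distinction concrete, here is a minimal sketch. The class names Holder and CopyingHolder are hypothetical and exist only for illustration. It relies on the language guarantee that string literals are interned, while new String("") always creates a distinct object.

```java
public class EmptyStringSharing {
    // Hypothetical container whose field is initialized from the literal.
    static class Holder {
        String name = "";             // every instance refers to the same pooled String
    }

    // Hypothetical container that allocates its own empty string.
    static class CopyingHolder {
        String name = new String(""); // a fresh String object per instance
    }

    // True when two separate Holder instances reference the same String object.
    static boolean literalsShared() {
        return new Holder().name == new Holder().name;
    }

    // True when two separate CopyingHolder instances reference the same String object.
    static boolean copiesShared() {
        return new CopyingHolder().name == new CopyingHolder().name;
    }

    public static void main(String[] args) {
        System.out.println("literal shared: " + literalsShared());
        System.out.println("copies shared:  " + copiesShared());
    }
}
```

The identity comparison with == is deliberate here: it checks whether both fields point at the same object, which is exactly what decides whether an extra object is paid for per instance.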



Answer 2:

Borrowed from this answer: the program reports 32 bytes for new String("") and 0 bytes for the literal "", which is already in the string pool.

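public class EmptyStringMemory {
    public static void main(String... args) {
        long free1 = free();
        String s = "";               // literal: already in the string pool
        long free2 = free();
        String s2 = new String("");  // forces allocation of a new String object
        long free3 = free();
        // With TLABs enabled, a single small allocation does not show up in freeMemory().
        if (free3 == free1) System.err.println("You need to use -XX:-UseTLAB");
        System.out.println("\"\" took " + (free1 - free2) + " bytes and new String(\"\") took "
                + (free2 - free3) + " bytes.");
    }

    // Free heap memory as currently reported by the JVM.
    private static long free() {
        return Runtime.getRuntime().freeMemory();
    }
}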


Answer 3:

It's true that you'll pay much more for an Integer than for an int. I remember checking a couple of Java versions back, and an Integer took about 24 bytes more. As long as the String field is null (that is, it points at nothing), you're only keeping a reference in memory. I don't think the JVM reserves space for the object in advance, so in that case you're not spending 24 bytes, just the 8 bytes of the reference.

If you create the string, though (even the empty string ""), then you have an object in memory, and since every object inherits from Object, it carries some baggage and takes up more memory than you would intuitively expect. Depending on how you use the string, a common solution is to start with null and lazily initialize the field when you need it.
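
The lazy initialization mentioned above might look like the following sketch. The class LazyLabel, its field, and the value "default" are hypothetical placeholders, and the code is not thread-safe as written.

```java
public class LazyLabel {
    // Stays null until first use, so an untouched instance pays only for the reference.
    private String label;

    // Reports whether the field has been populated yet.
    public boolean isInitialized() {
        return label != null;
    }

    // Creates the value on first access and reuses it afterwards.
    public String getLabel() {
        if (label == null) {
            label = computeLabel();
        }
        return label;
    }

    // Hypothetical stand-in for whatever would actually produce the value.
    private String computeLabel() {
        return "default";
    }

    public static void main(String[] args) {
        LazyLabel l = new LazyLabel();
        System.out.println("before: " + l.isInitialized());
        System.out.println("value:  " + l.getLabel());
        System.out.println("after:  " + l.isInitialized());
    }
}
```

The trade-off is a null check on every access in exchange for not holding a String object in instances that never use the field.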



Answer 4:

A String consists of an object header (2 words on HotSpot, 3 on J9), an int field, and a reference to a char array. The char array in turn consists of a header, an int length field, and the array elements. Everything is padded to a multiple of 8 bytes. Usually, chars are encoded with 2 bytes each, but some JVMs can use alternative encodings. At one point J9 could encode strings in UTF-8. Modern HotSpot (Java 9 and later, with compact strings) uses 1 byte per char if every char in the string fits in Latin-1 (ISO-8859-1), such as the English alphabet and digits. Otherwise it uses 2 bytes per char for the whole array.

The deep size of an empty string is thus:

  • 32 bytes on 32 bit HotSpot (16 bytes for String, 16 bytes for char[])
  • 56 bytes on 64 bit HotSpot (32 bytes for String, 24 bytes for char[])
  • 40 bytes on 64 bit HotSpot with compressed references (24 bytes for String, 16 bytes for char[])
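
The arithmetic behind these three figures can be reproduced with a small sketch. The header sizes used here (8 bytes on 32-bit, 16 bytes on 64-bit, 12 bytes on 64-bit with compressed references) and the 4-byte or 8-byte reference sizes are assumptions about HotSpot's layout, not values stated above, and all method names are hypothetical.

```java
public class EmptyStringSize {
    // Round up to the next multiple of 8 bytes (the padding described above).
    static int align8(int bytes) {
        return (bytes + 7) & ~7;
    }

    // Shallow String size: header + 4-byte int field + reference to the char array.
    static int stringSize(int header, int ref) {
        return align8(header + 4 + ref);
    }

    // Empty char[] size: header + 4-byte int length field, no elements.
    static int emptyArraySize(int header) {
        return align8(header + 4);
    }

    // Deep size of an empty String: the String object plus its empty array.
    static int deepSize(int header, int ref) {
        return stringSize(header, ref) + emptyArraySize(header);
    }

    public static void main(String[] args) {
        // Assumed (header, reference) sizes in bytes for each configuration.
        System.out.println("32-bit:            " + deepSize(8, 4));
        System.out.println("64-bit:            " + deepSize(16, 8));
        System.out.println("64-bit compressed: " + deepSize(12, 4));
    }
}
```

Under those assumptions the sketch yields 16 + 16, 32 + 24 and 24 + 16 bytes, matching the list.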

Commonly, the empty char array will not be allocated separately, as its value can be inferred. After all, all empty char arrays are equal. So the actual penalty for an empty string is the shallow String size, which is 24 bytes in the most common situation today (64-bit HotSpot with compressed references).

The IBM J9 JVM has a bigger object header, so it will allocate somewhat more.