Will the JIT optimize new objects?

Posted 2019-01-15 15:57

I created this class to be immutable and to have a fluent API:

public final class Message {
    public final String email;
    public final String escalationEmail;
    public final String assignee;
    public final String conversationId;
    public final String subject;
    public final String userId;

    public Message(String email, String escalationEmail, String assignee, String conversationId, String subject, String userId) {
        this.email = email;
        this.escalationEmail = escalationEmail;
        this.assignee = assignee;
        this.conversationId = conversationId;
        this.subject = subject;
        this.userId = userId;
    }

    public Message() {
        email = "";
        escalationEmail = "";
        assignee = "";
        conversationId = "";
        subject = "";
        userId = "";
    }

    public Message email(String e) { return new Message(e, escalationEmail, assignee, conversationId, subject, userId); }
    public Message escalationEmail(String e) { return new Message(email, e, assignee, conversationId, subject, userId); }
    public Message assignee(String a) { return new Message(email, escalationEmail, a, conversationId, subject, userId); }
    public Message conversationId(String c) { return new Message(email, escalationEmail, assignee, c, subject, userId); }
    public Message subject(String s) { return new Message(email, escalationEmail, assignee, conversationId, s, userId); }
    public Message userId(String u) { return new Message(email, escalationEmail, assignee, conversationId, subject, u); }

}

My question is: will the optimizer be able to avoid the many intermediate object creations when a new object is built like this?

Message m = new Message()
    .email("foo@bar.com")
    .assignee("bar@bax.com")
    .subject("subj");

Is there anything to be gained from making a separate mutable builder object instead?

Update 2: After reading apangin's answer, I consider my benchmark invalid. I'll keep it here as an example of how not to benchmark :)

Update: I took the liberty of measuring this myself with this code:

public final class Message {
public final String email;
public final String escalationEmail;
public final String assignee;
public final String conversationId;
public final String subject;
public final String userId;

public static final class MessageBuilder {
    private String email;
    private String escalationEmail;
    private String assignee;
    private String conversationId;
    private String subject;
    private String userId;

    MessageBuilder email(String e) { email = e; return this; }
    MessageBuilder escalationEmail(String e) { escalationEmail = e; return this; }
    MessageBuilder assignee(String e) { assignee = e; return this; }
    MessageBuilder conversationId(String e) { conversationId = e; return this; }
    MessageBuilder subject(String e) { subject = e; return this; }
    MessageBuilder userId(String e) { userId = e; return this; }

    public Message create() {
        return new Message(email, escalationEmail, assignee, conversationId, subject, userId);
    }

}

public static MessageBuilder createNew() {
    return new MessageBuilder();
}

public Message(String email, String escalationEmail, String assignee, String conversationId, String subject, String userId) {
    this.email = email;
    this.escalationEmail = escalationEmail;
    this.assignee = assignee;
    this.conversationId = conversationId;
    this.subject = subject;
    this.userId = userId;
}

public Message() {
    email = "";
    escalationEmail = "";
    assignee = "";
    conversationId = "";
    subject = "";
    userId = "";
}

public Message email(String e) { return new Message(e, escalationEmail, assignee, conversationId, subject, userId); }
public Message escalationEmail(String e) { return new Message(email, e, assignee, conversationId, subject, userId); }
public Message assignee(String a) { return new Message(email, escalationEmail, a, conversationId, subject, userId); }
public Message conversationId(String c) { return new Message(email, escalationEmail, assignee, c, subject, userId); }
public Message subject(String s) { return new Message(email, escalationEmail, assignee, conversationId, s, userId); }
public Message userId(String u) { return new Message(email, escalationEmail, assignee, conversationId, subject, u); }


static String getString() {
    return new String("hello");
    // return "hello";
}

public static void main(String[] args) {
    int n = 1000000000;

    long before1 = System.nanoTime();

    for (int i = 0; i < n; ++i) {
        Message m = new Message()
                .email(getString())
                .assignee(getString())
                .conversationId(getString())
                .escalationEmail(getString())
                .subject(getString())
                .userId(getString());
    }

    long after1 = System.nanoTime();

    long before2 = System.nanoTime();

    for (int i = 0; i < n; ++i) {
        Message m = Message.createNew()
                .email(getString())
                .assignee(getString())
                .conversationId(getString())
                .escalationEmail(getString())
                .subject(getString())
                .userId(getString())
                .create();
    }

    long after2 = System.nanoTime();



    System.out.println("no builder  : " + (after1 - before1)/1000000000.0);
    System.out.println("with builder: " + (after2 - before2)/1000000000.0);
}


}

I found the difference to be significant (the builder is faster) if the string arguments are not new objects but all the same instance (see the commented-out line in getString).

In what I imagine is a more realistic scenario, where all the strings are new objects, the difference is negligible, and JVM startup causes whichever loop runs first to be a tiny bit slower (I tried both orders).

With "new String" the code was also many times slower in total (I had to decrease n), perhaps indicating that there is some optimization of the "new Message" going on, but not of the "new String".

5 answers
乱世女痞
#2 · 2019-01-15 16:40

My understanding is that the JIT compiler works by rearranging the existing code and performing basic statistical analysis. I don't think, though, that the JIT compiler can optimize object allocation.

Your builder is incorrect, and your fluent API will not work the way you expect (that is, it will not create just a single object per build).

You need to have something like:

public class Message {
    public final String email;
    public final String escalationEmail;

    // Private: instances are created only through the Builder.
    private Message(String email, String escalationEmail) {
        this.email = email;
        this.escalationEmail = escalationEmail;
    }

    public static class Builder {
        public String email;
        public String escalationEmail;

        public static Builder createNew() {
            return new Builder();
        }

        public Builder withEmail(String email) {
            this.email = email;
            return this;
        }

        public Builder withEscalation(String escalation) {
            this.escalationEmail = escalation;
            return this;
        }

        // Fails fast on an obviously invalid email; returns this so the chain can continue.
        public Builder validate() {
            if (this.email == null || this.email.length() < 7) {
                throw new RuntimeException("invalid email");
            }
            return this;
        }

        // The only place where a Message is actually allocated.
        public Message build() {
            return new Message(this.email, this.escalationEmail);
        }
    }
}

Then you can write something like:

Message.Builder.createNew()
        .withEmail("example@email.com")
        .withEscalation("escalation")
        .validate()
        .build();
看我几分像从前
#3 · 2019-01-15 16:42

First, your code doesn't use a builder approach and generates a lot of objects, but there is already an example of a builder in another answer, so I will not add one more.

Then, regarding the JIT: the short answer is NO (there is no optimization of new object creation, except for dead code). The long answer is no, but there are other mechanisms in the JVM that optimize things.

There is a string pool that avoids creating multiple String objects when you use string literals. There are also caches of small values for the integral primitive wrapper types (so if you create a Long object with Long.valueOf, the same object is returned each time you ask for the same small value; this is guaranteed for -128 to 127). Regarding strings, there is also a string deduplication mechanism integrated into the G1 garbage collector since Java 8 update 20. You can test it with the following JVM options if you're using a recent JVM: -XX:+UseG1GC -XX:+UseStringDeduplication
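To make the pooling behaviour above concrete, here is a minimal sketch. The class and method names are made up for this illustration; only String.intern() and Long.valueOf() are real JDK calls. It compares references with == on purpose, which is normally a bug, just to show which objects are shared.

```java
// Illustration only: PoolingSketch and its methods are hypothetical names.
final class PoolingSketch {

    // Two identical literals refer to the same pooled String object.
    static boolean literalsShared() {
        String a = "hello";
        String b = "hello";
        return a == b;
    }

    // new String(...) always creates a distinct object, even for pooled content.
    static boolean newStringIsDistinct() {
        String pooled = "hello";
        String fresh = new String("hello");
        return fresh != pooled;
    }

    // intern() returns the pooled instance for the same character content.
    static boolean internReturnsPooled() {
        String pooled = "hello";
        return new String("hello").intern() == pooled;
    }

    // Long.valueOf is documented to cache values from -128 to 127.
    static boolean smallLongCached() {
        Long a = Long.valueOf(127L);
        Long b = Long.valueOf(127L);
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println("literals shared:       " + literalsShared());
        System.out.println("new String distinct:   " + newStringIsDistinct());
        System.out.println("intern returns pooled: " + internReturnsPooled());
        System.out.println("small Long cached:     " + smallLongCached());
    }
}
```

Values outside the -128 to 127 range may or may not be cached, so the sketch deliberately makes no claim about them.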

If you really want to optimize new object creation, you need to implement some sort of object pooling and make your objects immutable. But be careful: this is not a simple task, and you will end up with a lot of code that deals with object creation and manages the pool size so that it does not overflow your memory. So I advise you to do it only if it's really necessary.

Lastly, object instantiation on the heap is a cheap operation unless you create millions of objects per second, and the JVM performs a lot of optimizations in many areas. So, unless a good performance benchmark (or memory profiling) proves that you have an issue with object instantiation, don't think about it too much ;)

Regards,

Loïc

贪生不怕死
#4 · 2019-01-15 16:48

will the optimizer be able to avoid lots of object creation

No, but instantiation is a very cheap operation on the JVM. Worrying about this performance loss would be a typical example of premature optimization.

Is there anything to be gained from making a separate mutable builder object instead?

Working with immutables is generally a good approach. On the other hand, builders won't hurt you either if you use the builder instances in a small context, so that their mutable state is accessible only in a small, local environment. I don't see any severe disadvantages on either side; it is really up to your preference.

可以哭但决不认输i
#5 · 2019-01-15 16:49

With the builder pattern, you should write it like this:

Message msg = Message.createNew()
    .email("foo@bar.com")
    .assignee("bar@bax.com")
    .subject("subj")
    .build();

Here a static factory method creates an object of the builder class (it is named createNew() because new is a reserved word in Java and cannot be used as a method name). The functions email(...) and assignee(...) return this, and the final build() call creates the Message object from the collected data.

我欲成王,谁敢阻挡
#6 · 2019-01-15 16:53

Yes, HotSpot JIT can eliminate redundant allocations in a local context.

This optimization is provided by Escape Analysis, which has been enabled since JDK 6u23. It is often confused with on-stack allocation, but it is in fact much more powerful: it makes it possible not only to allocate objects on the stack, but to eliminate the allocation altogether by replacing the object's fields with variables (Scalar Replacement), which are then subject to further optimizations.

The optimization is controlled by the -XX:+EliminateAllocations JVM option, which is ON by default.


Thanks to the allocation elimination optimization, both of your ways of creating a Message object work effectively the same way: they do not allocate intermediate objects, just the final one.
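As a rough mental model of what Scalar Replacement achieves (not actual compiler output), the sketch below puts a fluent chain next to a hand-written equivalent in which the temporaries have been replaced by plain local variables. Msg is a hypothetical three-field stand-in for the Message class from the question, and the method names are made up for this illustration.

```java
// Illustration only: a hand-written picture of the effect, not what HotSpot literally emits.
final class ScalarReplacementSketch {

    // Hypothetical, reduced stand-in for the Message class from the question.
    static final class Msg {
        final String email;
        final String assignee;
        final String subject;

        Msg(String email, String assignee, String subject) {
            this.email = email;
            this.assignee = assignee;
            this.subject = subject;
        }

        Msg() { this("", "", ""); }

        Msg email(String e)    { return new Msg(e, assignee, subject); }
        Msg assignee(String a) { return new Msg(email, a, subject); }
        Msg subject(String s)  { return new Msg(email, assignee, s); }
    }

    // As written in source: four `new Msg` expressions, three of which
    // produce temporaries that never escape this method.
    static Msg fluent() {
        return new Msg()
                .email("foo@bar.com")
                .assignee("bar@bax.com")
                .subject("subj");
    }

    // Conceptually what is left once the calls are inlined and the
    // non-escaping temporaries are replaced by their fields as locals:
    // only the returned object still needs an allocation.
    static Msg scalarReplacedByHand() {
        String email = "foo@bar.com";
        String assignee = "bar@bax.com";
        String subject = "subj";
        return new Msg(email, assignee, subject);
    }

    public static void main(String[] args) {
        Msg a = fluent();
        Msg b = scalarReplacedByHand();
        System.out.println(a.email.equals(b.email)
                && a.assignee.equals(b.assignee)
                && a.subject.equals(b.subject));
    }
}
```

Whether the JIT actually performs this transformation depends on inlining succeeding and on the temporaries not escaping, so the two methods are only guaranteed to be equivalent in their result, not in their allocation behaviour.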

Your benchmark shows misleading results because it falls into many common microbenchmarking pitfalls:

  • it incorporates several benchmarks in a single method;
  • it measures an OSR stub instead of the final compiled version;
  • it does not do warm-up iterations;
  • it does not consume results, etc.
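For readers who want to see what addressing these points by hand might look like, here is a minimal sketch. All names in it are made up for this illustration, and Pair is a hypothetical two-field stand-in for Message. It puts the measured loop in its own method that is called repeatedly (so later calls run the normally compiled version rather than an OSR stub), discards warm-up rounds, and writes a checksum to a field so the results are consumed. A dedicated harness handles these issues far more rigorously, so treat this only as a picture of the ideas.

```java
// Illustration only: a hand-rolled harness, not a substitute for a benchmarking framework.
final class ManualBenchSketch {

    // Hypothetical two-field stand-in for the Message class from the question.
    static final class Pair {
        final String first;
        final String second;

        Pair(String first, String second) {
            this.first = first;
            this.second = second;
        }

        Pair first(String v)  { return new Pair(v, second); }
        Pair second(String v) { return new Pair(first, v); }
    }

    // Non-final on purpose, so the JIT cannot treat the input as a compile-time constant.
    static String input = "hello";

    // Every measurement writes here, so the measured work is not dead code.
    static volatile long sink;

    static int buildOnce(String s) {
        Pair p = new Pair("", "").first(s).second(s);
        return p.first.length() + p.second.length();
    }

    // One benchmark per method; returns elapsed nanoseconds for the whole loop.
    static long measure(int iterations) {
        long checksum = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            checksum += buildOnce(input);
        }
        long elapsed = System.nanoTime() - start;
        sink = checksum; // consume the result
        return elapsed;
    }

    public static void main(String[] args) {
        int iterations = 1_000_000;
        for (int round = 0; round < 5; round++) {
            measure(iterations); // warm-up rounds, timings discarded
        }
        long nanos = measure(iterations);
        System.out.println("ns/op: " + (double) nanos / iterations);
    }
}
```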

Let's measure it correctly with JMH. As a bonus, JMH has an allocation profiler (-prof gc) which shows how many bytes are really allocated per iteration. I've added a third test that runs with the EliminateAllocations optimization disabled to show the difference.

package bench;

import org.openjdk.jmh.annotations.*;

@State(Scope.Benchmark)
public class MessageBench {

    @Benchmark
    public Message builder() {
        return Message.createNew()
                .email(getString())
                .assignee(getString())
                .conversationId(getString())
                .escalationEmail(getString())
                .subject(getString())
                .userId(getString())
                .create();
    }

    @Benchmark
    public Message immutable() {
        return new Message()
                .email(getString())
                .assignee(getString())
                .conversationId(getString())
                .escalationEmail(getString())
                .subject(getString())
                .userId(getString());
    }

    @Benchmark
    @Fork(jvmArgs = "-XX:-EliminateAllocations")
    public Message immutableNoOpt() {
        return new Message()
                .email(getString())
                .assignee(getString())
                .conversationId(getString())
                .escalationEmail(getString())
                .subject(getString())
                .userId(getString());
    }

    private String getString() {
        return "hello";
    }
}

Here are the results. Both builder and immutable perform equally and allocate just 40 bytes per iteration (exactly the size of one Message object).

Benchmark                                        Mode  Cnt     Score     Error   Units
MessageBench.builder                             avgt   10     6,232 ±   0,111   ns/op
MessageBench.immutable                           avgt   10     6,213 ±   0,087   ns/op
MessageBench.immutableNoOpt                      avgt   10    41,660 ±   2,466   ns/op

MessageBench.builder:·gc.alloc.rate.norm         avgt   10    40,000 ±   0,001    B/op
MessageBench.immutable:·gc.alloc.rate.norm       avgt   10    40,000 ±   0,001    B/op
MessageBench.immutableNoOpt:·gc.alloc.rate.norm  avgt   10   280,000 ±   0,001    B/op
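The 40 and 280 byte figures can be reproduced with simple arithmetic. The sketch below assumes a typical 64-bit HotSpot layout with compressed references (a 12-byte object header, 4-byte references, 8-byte alignment); these constants are assumptions about the JVM configuration, not something the benchmark output states, and the class name is made up for this illustration.

```java
// Illustration only: the layout constants are assumptions about a typical 64-bit HotSpot setup.
final class MessageSizeEstimate {

    static final int HEADER_BYTES = 12;    // assumed: mark word + compressed class pointer
    static final int REFERENCE_BYTES = 4;  // assumed: compressed references
    static final int ALIGNMENT_BYTES = 8;  // assumed: default object alignment

    // Rounds a size up to the next multiple of the alignment.
    static int alignUp(int size) {
        return (size + ALIGNMENT_BYTES - 1) / ALIGNMENT_BYTES * ALIGNMENT_BYTES;
    }

    // Message has six String reference fields.
    static int messageBytes() {
        return alignUp(HEADER_BYTES + 6 * REFERENCE_BYTES);
    }

    // Without allocation elimination: new Message() plus one object per fluent call (six calls).
    static int bytesWithoutElimination() {
        return (1 + 6) * messageBytes();
    }

    public static void main(String[] args) {
        System.out.println("one Message:        " + messageBytes() + " B");
        System.out.println("seven Messages:     " + bytesWithoutElimination() + " B");
    }
}
```

Under these assumptions, 12 + 6 × 4 = 36 bytes rounds up to 40, and seven such objects give 280, which matches the gc.alloc.rate.norm column for immutableNoOpt.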