The JPA hashCode() / equals() dilemma

Posted 2019-01-01 03:21

Question:

There have been some discussions here about JPA entities and which hashCode()/equals() implementation should be used for JPA entity classes. Most (if not all) of them depend on Hibernate, but I'd like to discuss them in a JPA-implementation-neutral way (I am using EclipseLink, by the way).

Each possible implementation has its own advantages and disadvantages regarding:

  • hashCode()/equals() contract conformity (immutability) for List/Set operations
  • Whether identical objects (e.g. from different sessions, dynamic proxies from lazily-loaded data structures) can be detected
  • Whether entities behave correctly in detached (or non-persisted) state

As far as I can see, there are three options:

  1. Do not override them; rely on Object.equals() and Object.hashCode()
    • hashCode()/equals() work
    • cannot identify identical objects, problems with dynamic proxies
    • no problems with detached entities
  2. Override them, based on the primary key
    • hashCode()/equals() are broken
    • correct identity (for all managed entities)
    • problems with detached entities
  3. Override them, based on the Business-Id (non-primary key fields; what about foreign keys?)
    • hashCode()/equals() are broken
    • correct identity (for all managed entities)
    • no problems with detached entities

My questions are:

  1. Did I miss an option and/or pro/con point?
  2. What option did you choose and why?



UPDATE 1:

By "hashCode()/equals() are broken", I mean that successive hashCode() invocations may return differing values, which is (when correctly implemented) not broken in the sense of the Object API documentation, but which causes problems when trying to retrieve a changed entity from a Map, Set or other hash-based Collection. Consequently, JPA implementations (at least EclipseLink) will not work correctly in some cases.
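
To illustrate the kind of breakage meant here, the following is a minimal, JPA-free sketch. MutablePerson and MutableKeyDemo are hypothetical names that do not come from any JPA API; the class simply bases equals()/hashCode() on a field that is changed while the object sits in a HashSet.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical entity whose equals()/hashCode() depend on a mutable field.
final class MutablePerson {
    private String email;

    MutablePerson(String email) { this.email = email; }

    void setEmail(String email) { this.email = email; }

    @Override
    public boolean equals(Object o) {
        return o instanceof MutablePerson
                && Objects.equals(email, ((MutablePerson) o).email);
    }

    @Override
    public int hashCode() { return Objects.hashCode(email); }
}

final class MutableKeyDemo {
    // Reports whether the set still finds the element after its hash-relevant field changed.
    static boolean foundAfterChange() {
        Set<MutablePerson> people = new HashSet<>();
        MutablePerson p = new MutablePerson("a@example.org");
        people.add(p);
        p.setEmail("b@example.org"); // hash code changes while p is stored in the set
        return people.contains(p);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterChange());
    }
}
```

In this sketch foundAfterChange() returns false: the lookup uses the new hash code, while the element is still filed under the old one.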

UPDATE 2:

Thank you for your answers -- most of them have remarkable quality.
Unfortunately, I am still unsure which approach will be the best for a real-life application, or how to determine the best approach for my application. So, I'll keep the question open and hope for some more discussions and/or opinions.

Answer 1:

Read this very nice article on the subject: Don't Let Hibernate Steal Your Identity.

The conclusion of the article goes like this:

Object identity is deceptively hard to implement correctly when objects are persisted to a database. However, the problems stem entirely from allowing objects to exist without an id before they are saved. We can solve these problems by taking the responsibility of assigning object IDs away from object-relational mapping frameworks such as Hibernate. Instead, object IDs can be assigned as soon as the object is instantiated. This makes object identity simple and error-free, and reduces the amount of code needed in the domain model.



Answer 2:

I always override equals/hashCode and implement them based on the business id. It seems the most reasonable solution to me. See the following link.

To sum all this stuff up, here is a listing of what will work or won't work with the different ways to handle equals/hashCode: [image not preserved]

EDIT:

To explain why this works for me:

  1. I don't usually use hash-based collections (HashMap/HashSet) in my JPA application. If I must, I prefer to create a UniqueList solution.
  2. I think changing the business id at runtime is not a best practice for any database application. In the rare cases where there is no other solution, I'd do special treatment, like removing the element and putting it back into the hash-based collection.
  3. For my model, I set the business id in the constructor and don't provide setters for it. I let the JPA implementation change the field instead of the property.
  4. The UUID solution seems to be overkill. Why a UUID if you have a natural business id? I would after all enforce the uniqueness of the business id in the database. Why have THREE indexes for each table in the database then?


Answer 3:

We usually have two IDs in our entities:

  1. One is for the persistence layer only (so that the persistence provider and database can figure out relationships between objects).
  2. The other is for our application's needs (equals() and hashCode() in particular).

Take a look:

@Entity
public class User {

    @Id
    private int id;  // Persistence ID
    private UUID uuid; // Business ID

    // assuming all fields are subject to change
    // If we forbid users to change their email or screenName we can use these
    // fields for business ID instead, but generally that's not the case
    private String screenName;
    private String email;

    // I don't put UUID generation in constructor for performance reasons.
    // I call setUuid() when I create a new entity
    public User() {
    }

    // This method is only called when a brand new entity is added to 
    // persistence context - I add it as a safety net only but it might work 
    // for you. In some cases (say, when I add this entity to some set before 
    // calling em.persist()) setting a UUID might be too late. If I get a log 
    // output it means that I forgot to call setUuid() somewhere.
    @PrePersist
    public void ensureUuid() {
        if (getUuid() == null) {
            log.warn(String.format("User's UUID wasn't set on time. "
                + "uuid: %s, name: %s, email: %s",
                getUuid(), getScreenName(), getEmail()));
            setUuid(UUID.randomUUID());
        }
    }

    // equals() and hashCode() rely on non-changing data only. Thus we
    // guarantee that no matter how field values are changed we won't
    // lose our entity in hash-based Sets.
    @Override
    public int hashCode() {
        return getUuid().hashCode();
    }

    // Note that I don't use direct field access inside my entity classes and
    // call getters instead. That's because Persistence provider (PP) might
    // want to load entity data lazily. And I don't use
    //    this.getClass() == other.getClass()
    // for the same reason. In order to support laziness PP might need to wrap
    // my entity object in some kind of proxy, i.e. subclassing it.
    @Override
    public boolean equals(final Object obj) {
        if (this == obj)
            return true;
        if (!(obj instanceof User))
            return false;
        return getUuid().equals(((User) obj).getUuid());
    }

    // Getters and setters follow
}

EDIT: to clarify my point regarding calls to the setUuid() method. Here's a typical scenario:

User user = new User();
// user.setUuid(UUID.randomUUID()); // I should have called it here
user.setScreenName("Master Yoda");
user.setEmail("yoda@jedicouncil.org");

jediSet.add(user); // here's the bug - we forgot to set the UUID and
                   // we won't find Yoda in the Jedi set

em.persist(user); // ensureUuid() was called and printed the log for me.

jediCouncilSet.add(user); // OK, we got a UUID now

When I run my tests and see the log output I fix the problem:

User user = new User();
user.setUuid(UUID.randomUUID());

Alternatively, one can provide a separate constructor:

@Entity
public class User {

    @Id
    private int id;  // Persistence ID
    private UUID uuid; // Business ID

    ... // fields

    // Constructor for Persistence provider to use
    public User() {
    }

    // Constructor I use when creating new entities
    public User(UUID uuid) {
        setUuid(uuid);
    }

    ... // rest of the entity.
}

So my example would look like this:

User user = new User(UUID.randomUUID());
...
jediSet.add(user); // no bug this time

em.persist(user); // and no log output

I use a default constructor and a setter, but you may find the two-constructor approach more suitable for you.



Answer 4:

If you want to use equals()/hashCode() for your Sets, in the sense that the same entity can only be in there once, then there is only one option: Option 2. That's because a primary key for an entity by definition never changes (if somebody indeed updates it, it's not the same entity anymore).

You should take that literally: since your equals()/hashCode() are based on the primary key, you must not use these methods until the primary key is set. So you shouldn't put entities in the set until they're assigned a primary key. (Yes, UUIDs and similar concepts may help to assign primary keys early.)

Now, it's theoretically also possible to achieve that with Option 3, even though so-called "business keys" have the nasty drawback that they can change: "All you'll have to do is delete the already inserted entities from the set(s), and re-insert them." That is true, but it also means that in a distributed system, you'll have to make sure that this is done absolutely everywhere the data has been inserted (and you'll have to make sure that the update is performed before other things occur). You'll need a sophisticated update mechanism, especially if some remote systems aren't currently reachable...

Option 1 can only be used, if all the objects in your sets are from the same Hibernate session. The Hibernate documentation makes this very clear in chapter 13.1.3. Considering object identity:

Within a Session the application can safely use == to compare objects.

However, an application that uses == outside of a Session might produce unexpected results. This might occur even in some unexpected places. For example, if you put two detached instances into the same Set, both might have the same database identity (i.e., they represent the same row). JVM identity, however, is by definition not guaranteed for instances in a detached state. The developer has to override the equals() and hashCode() methods in persistent classes and implement their own notion of object equality.

It continues to argue in favor of Option 3:

There is one caveat: never use the database identifier to implement equality. Use a business key that is a combination of unique, usually immutable, attributes. The database identifier will change if a transient object is made persistent. If the transient instance (usually together with detached instances) is held in a Set, changing the hashcode breaks the contract of the Set.

This is true if you

  • cannot assign the id early (e.g. by using UUIDs)
  • and yet you absolutely want to put your objects in sets while they're in transient state.

Otherwise, you're free to choose Option 2.

Then it mentions the need for a relative stability:

Attributes for business keys do not have to be as stable as database primary keys; you only have to guarantee stability as long as the objects are in the same Set.

This is correct. The practical problem I see with this is: if you can't guarantee absolute stability, how will you be able to guarantee stability "as long as the objects are in the same Set"? I can imagine some special cases (like using sets only for a conversation and then throwing them away), but I would question the general practicability of this.


Short version:

  • Option 1 can only be used with objects within a single session.
  • If you can, use Option 2. (Assign the PK as early as possible, because you can't use the objects in sets until the PK is assigned.)
  • If you can guarantee relative stability, you can use Option 3. But be careful with this.


Answer 5:

I personally have already used all three of these strategies in different projects. And I must say that option 1 is, in my opinion, the most practicable in a real-life app. In my experience, breaking hashCode()/equals() conformity leads to many crazy bugs, as you will always end up in situations where the result of equality changes after an entity has been added to a collection.

But there are further options (also with their pros and cons):


a) hashCode/equals based on a set of immutable, non-null, constructor-assigned fields

(+) all three criteria are guaranteed

(-) field values must be available to create a new instance

(-) complicated handling if you must change one of them


b) hashCode/equals based on a primary key that is assigned by the application (in the constructor) instead of JPA

(+) all three criteria are guaranteed

(-) you cannot take advantage of simple, reliable ID generation strategies like DB sequences

(-) complicated if new entities are created in a distributed environment (client/server) or app server cluster


c) hashCode/equals based on a UUID assigned by the constructor of the entity

(+) all three criteria are guaranteed

(-) overhead of UUID generation

(-) there may be a small risk that the same UUID is used twice, depending on the algorithm used (may be detected by a unique index on the DB)
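
As a rough illustration of variant a), here is a plain-Java sketch. IsbnBook and its fields are made-up names, and the ISBN is only assumed to be a suitable immutable key. JPA generally expects a no-argument constructor and non-final persistent fields, so a real entity would more likely drop the final modifier and simply offer no setter for the key.

```java
import java.util.Objects;

// Hypothetical class: the ISBN is treated as an immutable, non-null business key.
final class IsbnBook {
    private final String isbn; // assigned once in the constructor, never changed
    private String title;      // mutable, deliberately not part of equality

    IsbnBook(String isbn, String title) {
        this.isbn = Objects.requireNonNull(isbn, "isbn");
        this.title = title;
    }

    void setTitle(String title) { this.title = title; }

    String getTitle() { return title; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof IsbnBook)) return false;
        return isbn.equals(((IsbnBook) o).isbn);
    }

    @Override
    public int hashCode() { return isbn.hashCode(); }
}
```

Because only the constructor-assigned key takes part in equality, changing the title does not move the object to a different hash bucket.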



Answer 6:

Although using a business key (option 3) is the most commonly recommended approach (Hibernate community wiki, "Java Persistence with Hibernate" p. 398), and this is what we mostly use, there's a Hibernate bug which breaks this for eager-fetched sets: HHH-3799. In this case, Hibernate can add an entity to a set before its fields are initialized. I'm not sure why this bug hasn't gotten more attention, as it really makes the recommended business-key approach problematic.

I think the heart of the matter is that equals and hashCode should be based on immutable state (reference Odersky et al.), and a Hibernate entity with Hibernate-managed primary key has no such immutable state. The primary key is modified by Hibernate when a transient object becomes persistent. The business key is also modified by Hibernate, when it hydrates an object in the process of being initialized.

That leaves only option 1, inheriting the java.lang.Object implementations based on object identity, or using an application-managed primary key as suggested by James Brundege in "Don't Let Hibernate Steal Your Identity" (already referenced by Stijn Geukens's answer) and by Lance Arlaus in "Object Generation: A Better Approach to Hibernate Integration".

The biggest problem with option 1 is that detached instances can't be compared with persistent instances using .equals(). But that's OK; the contract of equals and hashCode leaves it up to the developer to decide what equality means for each class. So just let equals and hashCode inherit from Object. If you need to compare a detached instance to a persistent instance, you can create a new method explicitly for that purpose, perhaps boolean sameEntity or boolean dbEquivalent or boolean businessEquals.
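
A separate comparison method of that kind might look roughly like the following sketch. DetachableCustomer is a hypothetical class without JPA annotations, and passing the id to the constructor merely stands in for what a persistence provider would assign.

```java
import java.util.Objects;

// Hypothetical entity that keeps Object's identity-based equals()/hashCode()
// and offers an explicit comparison by database id instead.
final class DetachableCustomer {
    private final Long id; // null while the object has never been persisted

    DetachableCustomer(Long id) { this.id = id; }

    Long getId() { return id; }

    // True for the same instance, or for two instances carrying the same non-null id.
    boolean sameEntity(DetachableCustomer other) {
        if (this == other) return true;
        return other != null && id != null && Objects.equals(id, other.id);
    }
}
```

With this design, two instances that represent the same row are not equal in the equals() sense, so hash-based collections behave exactly as they do for plain objects, while sameEntity() answers the database-identity question when it is asked explicitly.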



Answer 7:

  1. If you have a business key, then you should use that for equals/hashCode.
  2. If you don't have a business key, you should not leave it with the default Object equals and hashCode implementations, because that does not work after you merge an entity.
  3. You can use the entity identifier as suggested in this post. The only catch is that you need to use a hashCode implementation that always returns the same value, like this:

    @Entity
    public class Book implements Identifiable<Long> {
    
        @Id
        @GeneratedValue
        private Long id;
    
        private String title;
    
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Book)) return false;
            Book book = (Book) o;
            return getId() != null && Objects.equals(getId(), book.getId());
        }
    
        @Override
        public int hashCode() {
            return 31;
        }
    
        //Getters and setters omitted for brevity
    }
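
To see why the constant hash code matters, here is a JPA-free sketch of the same idea. ConstantHashBook, ConstantHashDemo and assignId() are hypothetical names; assignId() only simulates the provider generating the key on persist.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Plain-Java stand-in for an entity with a generated id and a constant hash code.
final class ConstantHashBook {
    private Long id; // stays null until the simulated "persist"

    void assignId(Long id) { this.id = id; } // stands in for the provider generating the key

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ConstantHashBook)) return false;
        return id != null && Objects.equals(id, ((ConstantHashBook) o).id);
    }

    @Override
    public int hashCode() { return 31; } // same bucket before and after the id is assigned
}

final class ConstantHashDemo {
    // Adds a transient object to a set, assigns the id afterwards, then looks it up again.
    static boolean foundAfterIdAssigned() {
        Set<ConstantHashBook> books = new HashSet<>();
        ConstantHashBook book = new ConstantHashBook();
        books.add(book);
        book.assignId(42L);
        return books.contains(book);
    }
}
```

The lookup succeeds because the hash code never changes. The price is that all instances share one bucket, so lookups in large hash-based collections degrade.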
    


Answer 8:

I agree with Andrew's answer. We do the same thing in our application, but instead of storing UUIDs as VARCHAR/CHAR, we split them into two long values. See UUID.getLeastSignificantBits() and UUID.getMostSignificantBits().

One more thing to consider is that calls to UUID.randomUUID() are pretty slow, so you might want to look into lazily generating the UUID only when needed, such as during persistence or calls to equals()/hashCode():

@MappedSuperclass
public abstract class AbstractJpaEntity extends AbstractMutable implements Identifiable, Modifiable {

    private static final long   serialVersionUID    = 1L;

    @Version
    @Column(name = "version", nullable = false)
    private int                 version             = 0;

    @Column(name = "uuid_least_sig_bits")
    private long                uuidLeastSigBits    = 0;

    @Column(name = "uuid_most_sig_bits")
    private long                uuidMostSigBits     = 0;

    private transient int       hashCode            = 0;

    public AbstractJpaEntity() {
        //
    }

    public abstract Integer getId();

    public abstract void setId(final Integer id);

    public boolean isPersisted() {
        return getId() != null;
    }

    public int getVersion() {
        return version;
    }

    //calling UUID.randomUUID() is pretty expensive, 
    //so this is to lazily initialize uuid bits.
    private void initUUID() {
        final UUID uuid = UUID.randomUUID();
        uuidLeastSigBits = uuid.getLeastSignificantBits();
        uuidMostSigBits = uuid.getMostSignificantBits();
    }

    public long getUuidLeastSigBits() {
        //it's safe to assume uuidMostSigBits of a random (version 4) UUID is never zero,
        //because the version bits are stored in that half
        if (uuidMostSigBits == 0) {
            initUUID();
        }
        return uuidLeastSigBits;
    }

    public long getUuidMostSigBits() {
        //it's safe to assume uuidMostSigBits of a random (version 4) UUID is never zero,
        //because the version bits are stored in that half
        if (uuidMostSigBits == 0) {
            initUUID();
        }
        return uuidMostSigBits;
    }

    public UUID getUuid() {
        return new UUID(getUuidMostSigBits(), getUuidLeastSigBits());
    }

    @Override
    public int hashCode() {
        if (hashCode == 0) {
            hashCode = (int) (getUuidMostSigBits() >> 32 ^ getUuidMostSigBits() ^ getUuidLeastSigBits() >> 32 ^ getUuidLeastSigBits());
        }
        return hashCode;
    }

    @Override
    public boolean equals(final Object obj) {
        if (obj == null) {
            return false;
        }
        if (!(obj instanceof AbstractJpaEntity)) {
            return false;
        }
        //UUID guarantees a pretty good uniqueness factor across distributed systems, so we can safely
        //dismiss getClass().equals(obj.getClass()) here since the chance of two different objects (even 
        //if they have different types) having the same UUID is astronomical
        final AbstractJpaEntity entity = (AbstractJpaEntity) obj;
        return getUuidMostSigBits() == entity.getUuidMostSigBits() && getUuidLeastSigBits() == entity.getUuidLeastSigBits();
    }

    @PrePersist
    public void prePersist() {
        // make sure the uuid is set before persisting
        getUuidLeastSigBits();
    }

}
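
The two-long representation can be checked in isolation with a small sketch like the one below. UuidBits is a hypothetical helper class; it only uses java.util.UUID methods from the standard library.

```java
import java.util.UUID;

// Hypothetical helper showing the UUID <-> two longs round trip.
final class UuidBits {
    // Splits a UUID into { mostSignificantBits, leastSignificantBits }.
    static long[] split(UUID uuid) {
        return new long[] { uuid.getMostSignificantBits(), uuid.getLeastSignificantBits() };
    }

    // Rebuilds the UUID from the two stored column values.
    static UUID join(long mostSigBits, long leastSigBits) {
        return new UUID(mostSigBits, leastSigBits);
    }

    public static void main(String[] args) {
        UUID original = UUID.randomUUID();
        long[] parts = split(original);
        System.out.println(original.equals(join(parts[0], parts[1])));
    }
}
```

UUID.randomUUID() produces version 4 UUIDs, whose version number is encoded in the most significant half, which is why that half can serve as the "not yet initialized" marker when it is zero.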


Answer 9:

As other people way smarter than me have pointed out already, there are numerous strategies out there. It seems, though, that the majority of applied design patterns try to hack their way to success. They limit constructor access, if not hinder constructor invocations completely, with specialized constructors and factory methods. Indeed, a clear-cut API is always pleasant. But if the sole reason is to make the equals and hashCode overrides compatible with the application, then I wonder if those strategies are in compliance with KISS (Keep It Simple, Stupid).

For me, I like to override equals and hashCode by way of examining the id. In these methods, I require the id to not be null and document this behavior well. Thus it becomes the developer's contract to persist a new entity before storing it somewhere else. An application that does not honor this contract would fail within the minute (hopefully).

Word of caution though: if your entities are stored in different tables and your provider uses an auto-generation strategy for the primary key, then you'll get duplicated primary keys across entity types. In such a case, also compare run-time types with a call to Object#getClass(), which of course will make it impossible for two different types to be considered equal. That suits me just fine for the most part.
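
A minimal sketch of that combination, id plus run-time type, could look like this. TypedIdEntity, Invoice and Payment are hypothetical names, and setId() stands in for the provider assigning the key. As another answer in this thread points out, a getClass() comparison may misbehave when the provider wraps entities in proxy subclasses.

```java
import java.util.Objects;

// Hypothetical base class: equal means same run-time class and same non-null id.
abstract class TypedIdEntity {
    private Long id;

    void setId(Long id) { this.id = id; } // stands in for the provider assigning the key

    @Override
    public final boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        // The documented contract: the id must be set before the entity is compared.
        return Objects.requireNonNull(id, "id not set").equals(((TypedIdEntity) o).id);
    }

    @Override
    public final int hashCode() {
        return Objects.requireNonNull(id, "id not set").hashCode();
    }
}

final class Invoice extends TypedIdEntity { }

final class Payment extends TypedIdEntity { }
```

An Invoice and a Payment that both received id 1 from their respective tables are then not considered equal, while two Invoice instances with id 1 are.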



Answer 10:

There are obviously already very informative answers here but I will tell you what we do.

We do nothing (i.e. do not override).

If we do need equals/hashCode to work for collections we use UUIDs. You just create the UUID in the constructor. We use http://wiki.fasterxml.com/JugHome for UUIDs. A UUID is a little more expensive CPU-wise but is cheap compared to serialization and DB access.



Answer 11:

The business-key approach doesn't suit us. We use a DB-generated ID, a temporary transient tempId, and override equals()/hashCode() to solve the dilemma. All entities are descendants of Entity. Pros:

  1. No extra fields in the DB
  2. No extra coding in descendant entities, one approach for all
  3. No performance issues (as with UUIDs); ID generation is left to the DB
  4. No problems with HashMaps (no need to keep the use of equals etc. in mind)
  5. The hash code of a new entity doesn't change over time, even after persisting

Cons:

  1. There may be problems with serializing and deserializing non-persisted entities
  2. The hash code of a saved entity may change after reloading from the DB
  3. Non-persisted objects are always considered different (maybe this is right?)
  4. What else?

Look at our code:

@MappedSuperclass
abstract public class Entity implements Serializable {

    @Id
    @GeneratedValue
    @Column(nullable = false, updatable = false)
    protected Long id;

    @Transient
    private Long tempId;

    public void setId(Long id) {
        this.id = id;
    }

    public Long getId() {
        return id;
    }

    private void setTempId(Long tempId) {
        this.tempId = tempId;
    }

    // Fix the id on the first call from equals() or hashCode()
    private Long getTempId() {
        if (tempId == null)
            // if we have id already, use it, else use 0
            setTempId(getId() == null ? 0 : getId());
        return tempId;
    }

    @Override
    public boolean equals(Object obj) {
        if (super.equals(obj))
            return true;
        // take proxied object into account
        if (obj == null || !Hibernate.getClass(obj).equals(this.getClass()))
            return false;
        Entity o = (Entity) obj;
        return getTempId() != 0 && o.getTempId() != 0 && getTempId().equals(o.getTempId());
    }

    // hash doesn't change over time
    @Override
    public int hashCode() {
        return getTempId() == 0 ? super.hashCode() : getTempId().hashCode();
    }
}


Answer 12:

Please consider the following approach based on predefined type identifier and the ID.

The specific assumptions for JPA:

  • entities of the same "type" and the same non-null ID are considered equal
  • non-persisted entities (assuming no ID) are never equal to other entities

The abstract entity:

@MappedSuperclass
public abstract class AbstractPersistable<K extends Serializable> {

  @Id @GeneratedValue
  private K id;

  @Transient
  private final String kind;

  public AbstractPersistable(final String kind) {
    this.kind = requireNonNull(kind, "Entity kind cannot be null");
  }

  @Override
  public final boolean equals(final Object obj) {
    if (this == obj) return true;
    if (!(obj instanceof AbstractPersistable)) return false;
    final AbstractPersistable<?> that = (AbstractPersistable<?>) obj;
    return null != this.id
        && Objects.equals(this.id, that.id)
        && Objects.equals(this.kind, that.kind);
  }

  @Override
  public final int hashCode() {
    return Objects.hash(kind, id);
  }

  public K getId() {
    return id;
  }

  protected void setId(final K id) {
    this.id = id;
  }
}

Concrete entity example:

static class Foo extends AbstractPersistable<Long> {
  public Foo() {
    super("Foo");
  }
}

Test example:

@Test
public void test_EqualsAndHashcode_GivenSubclass() {
  // Check contract
  EqualsVerifier.forClass(Foo.class)
    .suppress(Warning.NONFINAL_FIELDS, Warning.TRANSIENT_FIELDS)
    .withOnlyTheseFields("id", "kind")
    .withNonnullFields("id", "kind")
    .verify();
  // Ensure new objects are not equal
  assertNotEquals(new Foo(), new Foo());
}

Main advantages here:

  • simplicity
  • ensures subclasses provide type identity
  • predictable behavior with proxied classes

Disadvantages:

  • Requires each entity to call super()

Notes:

  • Needs attention when using inheritance. E.g. instance equality of class A and class B extends A may depend on concrete details of the application.
  • Ideally, use a business key as the ID

Looking forward to your comments.



Answer 13:

I have always used option 1 in the past because I was aware of these discussions and thought it was better to do nothing until I knew the right thing to do. Those systems are all still running successfully.

However, next time I may try option 2 - using the database generated Id.

hashCode() and equals() will throw IllegalStateException if the id is not set.

This will prevent subtle errors involving unsaved entities from appearing unexpectedly.
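
A rough sketch of this fail-fast idea follows. StrictEntity and requireId() are hypothetical names, and setId() only simulates the database-generated id being assigned.

```java
// Hypothetical entity that refuses to be hashed or compared before it has an id.
final class StrictEntity {
    private Long id;

    void setId(Long id) { this.id = id; } // stands in for the generated id being assigned

    private Long requireId() {
        if (id == null) {
            throw new IllegalStateException("equals()/hashCode() called before the id was assigned");
        }
        return id;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof StrictEntity)) return false;
        return requireId().equals(((StrictEntity) o).requireId());
    }

    @Override
    public int hashCode() { return requireId().hashCode(); }
}
```

Adding an unsaved StrictEntity to a HashSet then fails immediately with an IllegalStateException instead of silently filing the object under a hash code that later changes.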

What do people think of this approach?



Answer 14:

This is a common problem in every IT system that uses Java and JPA. The pain point extends beyond implementing equals() and hashCode(); it affects how an organization refers to an entity and how its clients refer to the same entity. I've seen enough pain from not having a business key that I wrote my own blog post to express my view.

In short: use a short, human-readable, sequential ID with meaningful prefixes as the business key, generated without any dependency on any storage other than RAM. Twitter's Snowflake is a very good example.
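
For readers unfamiliar with the idea, here is a rough Snowflake-style sketch, not Twitter's actual implementation. It assumes the commonly described layout of 41 timestamp bits, 10 worker bits and 12 sequence bits. The class name SnowflakeSketch, the custom epoch, the injected clock and the prefix format are all assumptions made for illustration.

```java
import java.util.function.LongSupplier;

// Snowflake-style id generator sketch: time-ordered ids built purely in memory.
final class SnowflakeSketch {
    private static final long EPOCH = 1_546_300_800_000L; // 2019-01-01T00:00:00Z, an arbitrary assumed epoch

    private final long workerId;
    private final LongSupplier clock; // injected so the sketch can be exercised deterministically
    private long lastMillis = -1L;
    private long sequence = 0L;

    SnowflakeSketch(long workerId, LongSupplier clock) {
        if (workerId < 0 || workerId > 1023) {
            throw new IllegalArgumentException("workerId must fit in 10 bits");
        }
        this.workerId = workerId;
        this.clock = clock;
    }

    synchronized long nextId() {
        long now = clock.getAsLong();
        if (now < lastMillis) {
            throw new IllegalStateException("clock moved backwards");
        }
        if (now == lastMillis) {
            sequence = (sequence + 1) & 0xFFF;
            if (sequence == 0) {
                // all 4096 sequence values of this millisecond are used: wait for the next one
                while (now <= lastMillis) {
                    now = clock.getAsLong();
                }
            }
        } else {
            sequence = 0;
        }
        lastMillis = now;
        return ((now - EPOCH) << 22) | (workerId << 12) | sequence;
    }

    // Hypothetical human-readable variant with a type prefix, for example "ORD-<number>".
    String nextId(String prefix) {
        return prefix + "-" + nextId();
    }
}
```

In production the clock would typically be System::currentTimeMillis. Because the id is available before the object is ever persisted, it can be assigned in the constructor and used in equals()/hashCode() from the start.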



Answer 15:

IMO you have 3 options for implementing equals/hashCode:

  • Use an application-generated identity, e.g. a UUID
  • Implement it based on a business key
  • Implement it based on the primary key

Using an application-generated identity is the easiest approach, but comes with a few downsides:

  • Joins are slower when using it as the PK, because 128 bits is simply bigger than 32 or 64 bits
  • "Debugging is harder", because checking with your own eyes whether some data is correct is pretty hard

If you can work with these downsides, just use this approach.

To overcome the join issue, one could use the UUID as a natural key and a sequence value as the primary key, but then you might still run into the equals/hashCode implementation problems in compositional child entities that have embedded ids, since you will want to join based on the primary key. Using the natural key in the child entity's id and the primary key for referring to the parent is a good compromise.

@Entity class Parent {
  @Id @GeneratedValue Long id;
  @NaturalId UUID uuid;
  @OneToMany(mappedBy = "parent") Set<Child> children;
  // equals/hashCode based on uuid
}

@Entity class Child {
  @EmbeddedId ChildId id;
  @ManyToOne Parent parent;

  @Embeddable class ChildId {
    UUID parentUuid;
    UUID childUuid;
    // equals/hashCode based on parentUuid and childUuid
  }
  // equals/hashCode based on id
}

IMO this is the cleanest approach, as it avoids all downsides and at the same time provides you with a value (the UUID) that you can share with external systems without exposing system internals.

Implementing it based on a business key, if you can expect one from the user, is a nice idea, but comes with a few downsides as well

Most of the time this business key will be some kind of code that the user provides, and less often a composite of multiple attributes.

  • Joins are slower because joining based on variable-length text is simply slow. Some DBMSs might even have problems creating an index if the key exceeds a certain length.
  • In my experience, business keys tend to change, which requires cascading updates to objects referring to them. This is impossible if external systems refer to them.

IMO you shouldn't implement or work with a business key exclusively. It's a nice add-on, i.e. users can quickly search by that business key, but the system shouldn't rely on it for operating.

Implementing it based on the primary key has its problems, but maybe it's not such a big deal

If you need to expose ids to an external system, use the UUID approach I suggested. If you don't, you could still use the UUID approach, but you don't have to. The problem with using a DBMS-generated id in equals/hashCode stems from the fact that the object might have been added to hash-based collections before the id was assigned.

The obvious way to get around this is to simply not add the object to hash based collections before assigning the id. I understand that this is not always possible because you might want deduplication before assigning the id already. To still be able to use the hash based collections, you simply have to rebuild the collections after assigning the id.

You could do something like this:

@Entity class Parent {
  @Id @GeneratedValue Long id;
  @OneToMany(mappedBy = "parent") Set<Child> children;
  // equals/hashCode based on id
}

@Entity class Child {
  @EmbeddedId ChildId id;
  @ManyToOne Parent parent;

  @PrePersist void prePersist() {
    parent.children.remove(this);
  }
  @PostPersist void postPersist() {
    parent.children.add(this);
  }

  @Embeddable class ChildId {
    Long parentId;
    @GeneratedValue Long childId;
    // equals/hashCode based on parentId and childId
  }
  // equals/hashCode based on id
}

I haven't tested the exact approach myself, so I'm not sure how changing collections in pre- and post-persist events works, but the idea is:

  • Temporarily remove the object from hash-based collections
  • Persist it
  • Re-add the object to the hash-based collections

Another way of solving this is to simply rebuild all your hash based models after an update/persist.
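
Both workarounds can be tried out without a persistence provider. In the sketch below, LineItem and RehashDemo are hypothetical names and assignId() merely simulates the id being generated on persist.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical entity with id-based equals()/hashCode(); the id arrives late.
final class LineItem {
    private Long id;

    void assignId(Long id) { this.id = id; } // stands in for the DBMS generating the id

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof LineItem)) return false;
        return id != null && id.equals(((LineItem) o).id);
    }

    @Override
    public int hashCode() { return Objects.hashCode(id); }
}

final class RehashDemo {
    // Id assigned while the item sits in the set: the set can no longer find it.
    static boolean naive() {
        Set<LineItem> items = new HashSet<>();
        LineItem item = new LineItem();
        items.add(item);
        item.assignId(7L);
        return items.contains(item);
    }

    // Remove, assign the id, re-add: the item is filed under its new hash code.
    static boolean removeThenReAdd() {
        Set<LineItem> items = new HashSet<>();
        LineItem item = new LineItem();
        items.add(item);
        items.remove(item);
        item.assignId(7L);
        items.add(item);
        return items.contains(item);
    }

    // Rebuild: copying into a fresh set rehashes every element.
    static boolean rebuild() {
        Set<LineItem> stale = new HashSet<>();
        LineItem item = new LineItem();
        stale.add(item);
        item.assignId(7L);
        Set<LineItem> fresh = new HashSet<>(stale);
        return fresh.contains(item);
    }
}
```

Here naive() returns false, while removeThenReAdd() and rebuild() return true. Whether a provider tolerates such collection changes inside lifecycle callbacks is a separate question that this sketch does not answer.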

In the end, it\'s up to you. I personally use the sequence based approach most of the time and only use the UUID approach if I need to expose an identifier to external systems.



Answer 16:

If UUID is the answer for many people, why don't we just use factory methods from the business layer to create the entities and assign the primary key at creation time?

For example:

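@ManagedBean
public class MyCarFacade {

  // injected by the container; the original snippet used em without declaring it
  @PersistenceContext
  private EntityManager em;

  public Car createCar(){
    Car car = new Car();
    em.persist(car);
    return car;
  }
}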

This way we would get a default primary key for the entity from the persistence provider, and our hashCode() and equals() functions could rely on that.

We could also declare the Car's constructors protected and then use reflection in our business method to access them. This way developers would not be tempted to instantiate Car with new, but would go through the factory method.

How about that?



Answer 17:

I tried to answer this question myself and was never totally happy with the solutions I found until I read this post, and especially Drew's answer. I liked the way he lazily created the UUID and stored it optimally.

But I wanted to add even more flexibility, i.e. lazily create the UUID ONLY when hashCode()/equals() is accessed before the first persistence of the entity, combining each solution's advantages:

  • equals() means "object refers to the same logical entity"
  • use the database ID as much as possible, because why would I do the work twice (performance concern)
  • prevent problems when accessing hashCode()/equals() on a not-yet-persisted entity and keep the same behaviour after it is indeed persisted

I would really appreciate feedback on my mixed solution below

public class MyEntity { 

    @Id()
    @Column(name = "ID", length = 20, nullable = false, unique = true)
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id = null;

    @Transient private UUID uuid = null;

    @Column(name = "UUID_MOST", nullable = true, unique = false, updatable = false)
    private Long uuidMostSignificantBits = null;
    @Column(name = "UUID_LEAST", nullable = true, unique = false, updatable = false)
    private Long uuidLeastSignificantBits = null;

    @Override
    public final int hashCode() {
        return this.getUuid().hashCode();
    }

    @Override
    public final boolean equals(Object toBeCompared) {
        if(this == toBeCompared) {
            return true;
        }
        if(toBeCompared == null) {
            return false;
        }
        if(!this.getClass().isInstance(toBeCompared)) {
            return false;
        }
        return this.getUuid().equals(((MyEntity)toBeCompared).getUuid());
    }

    public final UUID getUuid() {
        // UUID already accessed on this physical object
        if(this.uuid != null) {
            return this.uuid;
        }
        // UUID one day generated on this entity before it was persisted
        if(this.uuidMostSignificantBits != null) {
            this.uuid = new UUID(this.uuidMostSignificantBits, this.uuidLeastSignificantBits);
        // UUID never generated on this entity before it was persisted
        } else if(this.getId() != null) {
            this.uuid = new UUID(this.getId(), this.getId());
        // UUID never accessed on this not yet persisted entity
        } else {
            this.setUuid(UUID.randomUUID());
        }
        return this.uuid; 
    }

    private void setUuid(UUID uuid) {
        if(uuid == null) {
            return;
        }
        // For the one hypothetical case where a generated UUID could collide with a UUID built from IDs
        if(uuid.getMostSignificantBits() == uuid.getLeastSignificantBits()) {
            // unchecked exception, so the method needs no throws clause
            throw new IllegalStateException("UUID: " + uuid + " format is only for internal use");
        }
        this.uuidMostSignificantBits = uuid.getMostSignificantBits();
        this.uuidLeastSignificantBits = uuid.getLeastSignificantBits();
        this.uuid = uuid;
    }
}


Answer 18:

In practice, it seems that Option 2 (primary key) is the most frequently used. A natural and IMMUTABLE business key is a rare thing, and creating and supporting synthetic keys is too heavy a solution for situations that will probably never happen. Have a look at the spring-data-jpa AbstractPersistable implementation (the only thing: for a Hibernate implementation, use Hibernate.getClass).

public boolean equals(Object obj) {
    if (null == obj) {
        return false;
    }
    if (this == obj) {
        return true;
    }
    if (!getClass().equals(ClassUtils.getUserClass(obj))) {
        return false;
    }
    AbstractPersistable<?> that = (AbstractPersistable<?>) obj;
    return null == this.getId() ? false : this.getId().equals(that.getId());
}

@Override
public int hashCode() {
    int hashCode = 17;
    hashCode += null == getId() ? 0 : getId().hashCode() * 31;
    return hashCode;
}

Just be careful when manipulating new objects in a HashSet/HashMap. In contrast, Option 1 (keeping the Object implementation) breaks right after a merge, which is a very common situation.

If you have no business key and have a REAL need to manipulate new entities in a hash structure, override hashCode to return a constant, as Vlad Mihalcea advised.



Answer 19:

Below is a simple (and tested) solution for Scala.

  • Note that this solution does not fit into any of the 3 categories given in the question.

  • All my Entities are subclasses of the UUIDEntity so I follow the don't-repeat-yourself (DRY) principle.

  • If needed the UUID generation can be made more precise (by using more pseudo-random numbers).

Scala Code:

import javax.persistence._
import scala.util.Random

@Entity
@Inheritance(strategy = InheritanceType.TABLE_PER_CLASS)
abstract class UUIDEntity {
  @Id  @GeneratedValue(strategy = GenerationType.TABLE)
  var id:java.lang.Long=null
  var uuid:java.lang.Long=Random.nextLong()
  override def equals(o:Any):Boolean= 
    o match{
      case o : UUIDEntity => o.uuid==uuid
      case _ => false
    }
  override def hashCode() = uuid.hashCode()
}