Prevent Hibernate from deleting orphaned entities

Posted 2019-01-26 03:15

Question:

Take a very simple example of a one-to-many relationship (country -> state).

Country (inverse side) :

@OneToMany(mappedBy = "country", fetch = FetchType.LAZY, cascade = CascadeType.ALL, orphanRemoval = true)
private List<StateTable> stateTableList=new ArrayList<StateTable>(0);

StateTable (owning side) :

@JoinColumn(name = "country_id", referencedColumnName = "country_id")
@ManyToOne(fetch = FetchType.LAZY, cascade = {CascadeType.PERSIST, CascadeType.MERGE, CascadeType.REFRESH, CascadeType.DETACH})
private Country country;

The method attempting to update a supplied (detached) StateTable entity within an active database transaction (JTA or resource local) :

public StateTable update(StateTable stateTable) {

    // Getting the original state entity from the database.
    StateTable oldState = entityManager.find(StateTable.class, stateTable.getStateId());
    // Get hold of the original country (with countryId = 67, for example).
    Country oldCountry = oldState.getCountry();
    // Getting a new country entity (with countryId = 68) supplied by the client application which is responsible for modifying the StateTable entity.
    // Country has been changed from 67 to 68 in the StateTable entity using for example, a drop-down list.
    Country newCountry = entityManager.find(Country.class, stateTable.getCountry().getCountryId());
    // Attaching a managed instance to StateTable.
    stateTable.setCountry(newCountry);

    // Check whether the supplied country and the original country entities are equal.
    // (Both not null and not equal - http://stackoverflow.com/a/31761967/1391249)
    if (ObjectUtils.notEquals(newCountry, oldCountry)) {
        // Remove the state entity from the inverse collection held by the original country entity.
        oldCountry.remove(oldState);
        // Add the state entity to the inverse collection held by the newly supplied country entity
        newCountry.add(stateTable);
    }

    return entityManager.merge(stateTable);
}

Note that orphanRemoval is set to true. The StateTable entity is supplied by a client application that wants to change the entity's associated Country from countryId = 67 to countryId = 68. On the inverse side in JPA, this amounts to migrating a child entity from one parent's collection to another parent's collection, which orphanRemoval = true works against.

The Hibernate provider issues a DELETE DML statement causing the row corresponding to the StateTable entity to be removed from the underlying database table.

Despite the fact that orphanRemoval is set to true, I expect Hibernate to issue a regular UPDATE DML statement, with the effect of orphanRemoval suspended entirely, because the relationship link is migrated (not simply deleted).

EclipseLink does exactly that. It issues an UPDATE statement in the given scenario (with the same relationship and orphanRemoval set to true).

Which one is behaving according to the specification? Is it possible to make Hibernate issue an UPDATE statement in this case other than removing orphanRemoval from the inverse side?


This is only an attempt to keep a bidirectional relationship consistent on both sides.

In case they are relevant, the defensive link management methods add() and remove() used in the snippet above are defined in the Country entity as follows.

public void add(StateTable stateTable) {
    List<StateTable> newStateTableList = getStateTableList();

    if (!newStateTableList.contains(stateTable)) {
        newStateTableList.add(stateTable);
    }

    if (stateTable.getCountry() != this) {
        stateTable.setCountry(this);
    }
}

public void remove(StateTable stateTable) {
    List<StateTable> newStateTableList = getStateTableList();

    if (newStateTableList.contains(stateTable)) {
        newStateTableList.remove(stateTable);
    }
}


Update :

Hibernate issues the expected UPDATE DML statement only if the code is modified in the following way.

public StateTable update(StateTable stateTable) {
    StateTable oldState = entityManager.find(StateTable.class, stateTable.getStateId());
    Country oldCountry = oldState.getCountry();
    // DELETE is issued, if getReference() is replaced by find().
    Country newCountry = entityManager.getReference(Country.class, stateTable.getCountry().getCountryId());

    // The following line should not be necessary, because the Country has
    // already been retrieved and assigned to oldCountry above.
    // Thus, oldState.getCountry() should no longer be an uninitialized proxy.
    oldState.getCountry().hashCode(); // DELETE is issued, if removed.
    stateTable.setCountry(newCountry);

    if (ObjectUtils.notEquals(newCountry, oldCountry)) {
        oldCountry.remove(oldState);
        newCountry.add(stateTable);
    }

    return entityManager.merge(stateTable);
}

Observe the following two lines in the newer version of the code.

// Previously it was EntityManager#find()
Country newCountry = entityManager.getReference(Country.class, stateTable.getCountry().getCountryId());
// Previously it was absent.
oldState.getCountry().hashCode();

If the last line is absent, or if EntityManager#getReference() is replaced by EntityManager#find(), a DELETE DML statement is unexpectedly issued.

So, what is going on here? I am especially concerned about portability. If this kind of basic functionality does not port across JPA providers, it severely undermines the point of using an ORM framework.

I understand the basic difference between EntityManager#getReference() and EntityManager#find().

Answer 1:

Firstly, let's change your original code to a simpler form :

StateTable oldState = entityManager.find(StateTable.class, stateTable.getStateId());
Country oldCountry = oldState.getCountry();
oldState.getCountry().hashCode(); // DELETE is issued, if removed.

Country newCountry = entityManager.find(Country.class, stateTable.getCountry().getCountryId());
stateTable.setCountry(newCountry);

if (ObjectUtils.notEquals(newCountry, oldCountry)) {
    oldCountry.remove(oldState);
    newCountry.add(stateTable);
}

entityManager.merge(stateTable);

Notice that I only added oldState.getCountry().hashCode() in the third line. Now you can reproduce your issue by removing this line only.

Before we explain what's going on here, first some excerpts from the JPA 2.1 specification.

Section 3.2.4:

The semantics of the flush operation, applied to an entity X are as follows:

  • If X is a managed entity, it is synchronized to the database.
    • For all entities Y referenced by a relationship from X, if the relationship to Y has been annotated with the cascade element value cascade=PERSIST or cascade=ALL, the persist operation is applied to Y

Section 3.2.2:

The semantics of the persist operation, applied to an entity X are as follows:

  • If X is a removed entity, it becomes managed.

orphanRemoval JPA javadoc:

(Optional) Whether to apply the remove operation to entities that have been removed from the relationship and to cascade the remove operation to those entities.

As we can see, orphanRemoval is defined in terms of the remove operation, so all the rules that apply to remove must apply to orphanRemoval as well.

Secondly, as explained in this answer, the order of updates executed by Hibernate is the order in which entities are loaded in the persistence context. To be more precise, updating an entity means synchronizing its current state (dirty check) with the database and cascading the PERSIST operation to its associations.

Now, this is what's happening in your case. At the end of the transaction Hibernate synchronizes the persistence context with the database. We have two scenarios:

  1. When the extra line (hashCode) is present :

    1. Hibernate synchronizes oldCountry with the DB. It does it before handling newCountry, because oldCountry was loaded first (proxy initialization forced by calling hashCode).
    2. Hibernate sees that a StateTable instance has been removed from the oldCountry's collection, thus marking the StateTable instance as removed.
    3. Hibernate synchronizes newCountry with the DB. The PERSIST operation cascades to the stateTableList which now contains the removed StateTable entity instance.
    4. The removed StateTable instance is now managed again (3.2.2 section of JPA specification quoted above).
  2. When the extra line (hashCode) is absent :

    1. Hibernate synchronizes newCountry with the DB. It does it before handling oldCountry, because newCountry was loaded first (with entityManager.find).
    2. Hibernate synchronizes oldCountry with the DB.
    3. Hibernate sees that a StateTable instance has been removed from the oldCountry's collection, thus marking the StateTable instance as removed.
    4. The removal of the StateTable instance is synchronized with the database.

The order of updates also explains your findings in which you basically forced oldCountry proxy initialization to happen before loading newCountry from the DB.

So, is this according to the JPA specification? Obviously yes, no JPA spec rule is broken.

Why is this not portable?

The JPA specification (like any other specification, after all) gives providers the freedom to define many details that the spec does not cover.

It also depends on what you mean by 'portability'. The orphanRemoval feature, like any other JPA feature, is portable as far as its formal definition goes. Whether your code is portable, however, depends on how you combine those features with the specifics of your JPA provider.

By the way, section 2.9 of the spec says the following about orphanRemoval (as a recommendation rather than a clear definition):

Portable applications must otherwise not depend upon a specific order of removal, and must not reassign an entity that has been orphaned to another relationship or otherwise attempt to persist it.

But this is just an example of a vague or poorly defined recommendation in the spec, because persisting removed entities is allowed by other statements in the specification.



Answer 2:

As soon as your referenced entity can be used by other parents, it gets complicated anyway. To do this cleanly, the ORM would have to search the database for any other usages of the removed entity before deleting it (persistent garbage collection). This is time-consuming, therefore not really useful, and therefore not implemented in Hibernate.

Orphan deletion only works if the child belongs to a single parent and is never reused anywhere else. You may even get an exception when you try to reuse it, which helps detect misuse of this feature.

Decide whether you want to keep orphan deletion or not. If you want to keep it, you need to create a new child for the new parent instead of moving the existing one.
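
A minimal sketch of the "create a new child instead of moving it" idea, using plain Java objects in place of the real entities. JPA annotations and the EntityManager are left out on purpose so the example stands alone. The outer class ReassignSketch, the moveByCopy helper, and the stateName field are hypothetical and not part of the question's code.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java stand-ins for the entities; not real JPA mappings.
class ReassignSketch {

    static class Country {
        final List<StateTable> stateTableList = new ArrayList<>();

        void add(StateTable stateTable) {
            if (!stateTableList.contains(stateTable)) {
                stateTableList.add(stateTable);
            }
            stateTable.country = this;
        }

        void remove(StateTable stateTable) {
            stateTableList.remove(stateTable);
        }
    }

    static class StateTable {
        final String stateName; // hypothetical payload field
        Country country;

        StateTable(String stateName) {
            this.stateName = stateName;
        }
    }

    // Detaches the old child from its parent (where orphan removal would delete it)
    // and gives the new parent a fresh copy instead of the same instance.
    static StateTable moveByCopy(StateTable oldState, Country newCountry) {
        Country oldCountry = oldState.country;
        if (oldCountry != null) {
            oldCountry.remove(oldState);
        }
        oldState.country = null;
        StateTable copy = new StateTable(oldState.stateName);
        newCountry.add(copy);
        return copy;
    }

    public static void main(String[] args) {
        Country oldCountry = new Country();
        Country newCountry = new Country();
        StateTable state = new StateTable("Example");
        oldCountry.add(state);

        StateTable copy = moveByCopy(state, newCountry);
        System.out.println(copy != state);                             // a different instance
        System.out.println(oldCountry.stateTableList.isEmpty());       // old parent no longer holds it
        System.out.println(newCountry.stateTableList.contains(copy));  // new parent holds the copy
    }
}
```

With real entities the copy would be persisted as a new row with a new identifier while the original row is deleted as an orphan, so anything that refers to the old identifier would need attention.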

If you give up orphan deletion, you have to delete the children yourself as soon as they are no longer referenced.