JPA/EclipseLink cache life span

Posted 2019-06-26 09:45

1.- I'm working with GlassFish 2.1 and EclipseLink 2.0.0, so really using the JPA 1.0 specification, and I have a stateless EJB that, among other things, finds entities. As far as I know, JPA 1.0 defines an L1 cache that works at the persistence context level (transaction level for stateless EJBs), but I can't figure out why the following code prints "Not same instance" even though both calls happen within the same transaction.

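import javax.ejb.Stateless;
import javax.ejb.TransactionAttribute;
import javax.ejb.TransactionAttributeType;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

@Stateless
@TransactionAttribute(TransactionAttributeType.REQUIRES_NEW)
public class EntityServiceBean implements EntityServiceLocal {
    @PersistenceContext(unitName = "Model")
    private EntityManager entityManager;

    @Override
    public <T> T find(Class<T> type, Object id) {
        T entity = entityManager.find(type, id);
        // Second lookup in the same transaction: expected to return the same instance.
        if (entity != entityManager.find(type, id)) {
            System.out.println("Not same instance");
        }
        return entity;
    }
    ....
}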

I even tried with the property:

<property name="eclipselink.cache.type.default" value="Full"/>

in the persistence.xml file, but the result is the same.

2.- What I would really like to achieve, if possible, is for multiple calls to my stateless EJB to return the same instance; in other words, to extend the life of the JPA cache across transactions and persistence contexts when using stateless EJBs. For example:

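... // POJO class
EntityServiceLocal entityService = ...
Product pA = entityService.find(Product.class, 1L);
...
Product pB = entityService.find(Product.class, 1L);
// Parentheses are required: without them, + binds tighter than == and the string is compared to pB.
System.out.println("Same instance? " + (pA == pB)); // desired: true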

I read that many JPA implementations, even JPA 1.0 ones, provide an L2 cache (now defined in JPA 2.0) that spans multiple persistence contexts, but I don't know whether I have misunderstood the L2 cache concept and/or I'm missing some configuration.

Is this possible? If not, what can I do to avoid reading more than 20k entities from the DB every minute just to update the ones that need it?

3 answers
我想做一个坏孩纸
#2 · 2019-06-26 09:50

I'm working with GlassFish 2.1 and EclipseLink 2.0.0, so really using the JPA 1.0 specification, and I have a stateless EJB that, among other things, finds entities. As far as I know, JPA 1.0 defines an L1 cache that works at the persistence context level (transaction level for stateless EJBs), but I can't figure out why the following code prints "Not same instance" even though both calls happen within the same transaction.

This is extremely weird: object identity should definitely be maintained inside a transaction in a Java EE context. This is very well documented in the JPA wikibook:

Object Identity

Object identity in Java means if two variables (x, y) refer to the same logical object, then x == y returns true. Meaning that both reference the same thing (both a pointer to the same memory location).

In JPA object identity is maintained within a transaction, and (normally) within the same EntityManager. The exception is in a JEE managed EntityManager, object identity is only maintained inside of a transaction.

So the following is true in JPA:

Employee employee1 = entityManager.find(Employee.class, 123);
Employee employee2 = entityManager.find(Employee.class, 123);
assert (employee1 == employee2);

This holds true no matter how the object is accessed:

Employee employee1 = entityManager.find(Employee.class, 123);
Employee employee2 = employee1.getManagedEmployees().get(0).getManager();
assert (employee1 == employee2);

In JPA object identity is not maintained across EntityManagers. Each EntityManager maintains its own persistence context, and its own transactional state of its objects.

So the following is true in JPA:

EntityManager entityManager1 = factory.createEntityManager();
EntityManager entityManager2 = factory.createEntityManager();
Employee employee1 = entityManager1.find(Employee.class, 123);
Employee employee2 = entityManager2.find(Employee.class, 123);
assert (employee1 != employee2);

Object identity is normally a good thing, as it avoids having your application manage multiple copies of objects, and avoids the application changing one copy, but not the other. The reason different EntityManagers or transactions (in JEE) don't maintain object identity is that each transaction must isolate its changes from other users of the system. This is also normally a good thing, however it does require the application to be aware of copies, detached objects and merging.

Some JPA products may have a concept of read-only objects, in which object identity may be maintained across EntityManagers through a shared object cache.

And I couldn't reproduce the problem with EclipseLink 2.0 in a Java SE environment (within a transaction and the same EntityManager) - sorry I won't test under GF 2.1.
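
To make the identity guarantee concrete, here is a minimal plain-Java sketch of the idea behind a persistence context: a map from primary key to the one managed instance. The classes `PersistenceContextSketch` and `IdentityDemo` are hypothetical illustrations, not EclipseLink code, and the sketch keys on the id alone, whereas a real provider would key on entity class plus id.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a persistence context; not EclipseLink code.
class PersistenceContextSketch {
    // At most one managed instance per primary key.
    private final Map<Object, Object> managed = new HashMap<>();
    private final Function<Object, Object> loader;

    PersistenceContextSketch(Function<Object, Object> loader) {
        this.loader = loader;
    }

    // Like find(): load on the first call, return the managed instance afterwards.
    Object find(Object id) {
        return managed.computeIfAbsent(id, loader);
    }
}

class IdentityDemo {
    // Simulated database read: builds a fresh object on every call.
    static Object loadFromDb(Object id) {
        return new StringBuilder("Employee#" + id);
    }

    // Two finds through the same context.
    static boolean sameWithinContext() {
        PersistenceContextSketch ctx = new PersistenceContextSketch(IdentityDemo::loadFromDb);
        return ctx.find(123) == ctx.find(123);
    }

    // Two finds through two different contexts.
    static boolean sameAcrossContexts() {
        PersistenceContextSketch ctx1 = new PersistenceContextSketch(IdentityDemo::loadFromDb);
        PersistenceContextSketch ctx2 = new PersistenceContextSketch(IdentityDemo::loadFromDb);
        return ctx1.find(123) == ctx2.find(123);
    }

    public static void main(String[] args) {
        System.out.println(sameWithinContext());  // true
        System.out.println(sameAcrossContexts()); // false
    }
}
```

If the real code prints "Not same instance", the two finds are presumably not going through the same persistence context, which is why the mapping and the remaining settings are worth checking.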

I even tried with the property: <property name="eclipselink.cache.type.default" value="Full"/> in the persistence.xml file, but does the same

There is nothing to "activate" for the L1 cache.

What I would really like to achieve, if possible, is that multiple calls to my stateless EJB return the same instance, in other words span the JPA cache life across transactions and Persistence Contexts using stateless EJBs (...):

An L2 cache is indeed a cache that spans multiple transactions and EntityManagers, and L2 caching is supported by most JPA providers. But while L2 caching will reduce database hits, object identity is not guaranteed with all providers.

For example, with Hibernate, the L2 cache isn't enabled by default, and you won't get object identity because Hibernate doesn't put the entities themselves in the cache.

With EclipseLink, the L2 cache is enabled by default, and whether you get object identity depends on the cache type. The default is the SOFT-WEAK cache of size 100, and it does preserve object identity. While you can configure things very finely (down to the entity level), whether or not the environment is distributed, things should work by default.
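
As a rough sketch, the defaults described above could be spelled out explicitly as persistence unit properties. The property names below are the ones I recall from the EclipseLink documentation (the first also appears in the question); treat them and their values as assumptions to verify against your EclipseLink version. The class `CacheSettingsSketch` is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; the property names are assumptions to check against the EclipseLink docs.
class CacheSettingsSketch {
    // Builds overrides that restate the default shared-cache settings.
    static Map<String, String> sharedCacheProperties() {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("eclipselink.cache.shared.default", "true");   // keep the shared (L2) cache on
        props.put("eclipselink.cache.type.default", "SoftWeak"); // cache type
        props.put("eclipselink.cache.size.default", "100");      // cache size
        return props;
    }

    public static void main(String[] args) {
        // With JPA on the classpath, the map could be passed as:
        // Persistence.createEntityManagerFactory("Model", sharedCacheProperties());
        System.out.println(sharedCacheProperties());
    }
}
```

The same keys can equally go into persistence.xml as property elements; the map form is used here only so the sketch runs without a JPA provider.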

Explosion°爆炸
#3 · 2019-06-26 09:54

We have found that the EclipseLink L2 entity cache doesn't seem to do what it's supposed to do at all. We have relatively few database objects to keep in the L2 cache, yet our memory usage spirals out of control over the course of 24 hours until we run out of heap (4 GB worth). I have profiled the application and confirmed that the memory usage comes from EclipseLink's L2 cache. We are using SOFT-WEAK, and as far as we can tell it never evicts old objects (something the docs suggest it's supposed to do), and it creates multiple instances of objects whose properties are the same (which the docs suggest it's NOT supposed to do). This is a read-only application, for goodness' sake.

We have also been utterly unsuccessful at getting any manner of cache coordination to work in a clustered environment. Three senior Java devs with over a decade of experience each, one of whom actually worked at IBM on their Java implementation, have been unable to get any cache coordination to function, despite spending time on the problem periodically over the last six months and even tracing through the EclipseLink source to figure out what was going wrong.

It's getting pretty tiresome rebooting our application every 24 hours, and I'm just about done with L2 caching at this point. I only hope we can get acceptable performance without it; at least EclipseLink's JPA 1 implementations could manage parallel reads, which seem to have evaporated in JPA 2. Database implementors solved this stuff years ago with MVCC, so why can't a set of smart Java devs manage the same feat, given that it's already been done? All we're doing is essentially moving it upstream and expressing the data as objects instead of tuples and pages.

Anthone
#4 · 2019-06-26 10:07

What other settings have you set, and how have you mapped your class?

Two finds in the same transaction should always be the same instance.

You can have find() return the same instance across transaction boundaries if you mark the object or the query as read-only (@ReadOnly, "eclipselink.read-only"="true"). You will need to ensure that you really use it as read-only, though. Allowing updates to the same instance shared by different transactions is not possible, and not a good idea.
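
To illustrate why a shared instance has to be treated as read-only, here is a small plain-Java sketch. `CachedProduct`, `SharedCacheSketch`, and `ReadOnlyDemo` are hypothetical names, and the map only simulates a shared cache that hands out the cached object itself; it is not how EclipseLink is implemented.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical entity used only for this illustration.
class CachedProduct {
    final long id;
    String name;

    CachedProduct(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Hypothetical stand-in for a shared cache that returns the cached instance itself.
class SharedCacheSketch {
    private final Map<Long, CachedProduct> cache = new ConcurrentHashMap<>();

    CachedProduct findReadOnly(long id) {
        // Every caller gets the same object for the same id.
        return cache.computeIfAbsent(id, key -> new CachedProduct(key, "original"));
    }
}

class ReadOnlyDemo {
    // Shows that a change made by one caller is immediately visible to another.
    static String nameSeenByOtherCaller() {
        SharedCacheSketch shared = new SharedCacheSketch();
        CachedProduct pA = shared.findReadOnly(1L); // first "transaction"
        CachedProduct pB = shared.findReadOnly(1L); // second "transaction", same instance
        pA.name = "edited but never committed";
        return pB.name;
    }

    public static void main(String[] args) {
        System.out.println(nameSeenByOtherCaller()); // edited but never committed
    }
}
```

Because both callers hold the same object, there is no isolation between them: any write leaks to every other holder, whether or not it is ever committed.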
