I'm working on Wildfly 9 with Infinispan 7.2.3.
I'm facing up to a strange problem related to distributed cache:
- On the application server i have N deployed wars exposing REST services
- Each service code has the common duty to check if a CacheManager si already present on JNDI, if yes, it uses it otherwise i creates a new one and the bind it to the JNDI. So every war works with a unique CacheManager instance.
- The Infinispan CacheManager is configured in distributed mode.
The infinispan and jgroups are provided from the application server. After a re-deploy operation (undploy and deploy) of all the wars if i suddenly start to send REST request to these services i get this error:
18:23:42,366 WARN [org.infinispan.topology.ClusterTopologyManagerImpl] (transport-thread--p2-t12) ISPN000197: Error updating cluster member list: org.infinispan.util.concurrent.Timeout
Exception: Replication timeout for ws-7-aor-58034
at org.infinispan.remoting.transport.AbstractTransport.parseResponseAndAddToResponseList(AbstractTransport.java:87)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:586)
at org.infinispan.topology.ClusterTopologyManagerImpl.confirmMembersAvailable(ClusterTopologyManagerImpl.java:402)
at org.infinispan.topology.ClusterTopologyManagerImpl.updateCacheMembers(ClusterTopologyManagerImpl.java:393)
at org.infinispan.topology.ClusterTopologyManagerImpl.handleClusterView(ClusterTopologyManagerImpl.java:309)
at org.infinispan.topology.ClusterTopologyManagerImpl$ClusterViewListener$1.run(ClusterTopologyManagerImpl.java:590)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
18:23:42,539 WARN [org.infinispan.topology.ClusterTopologyManagerImpl] (remote-thread--p11-t2) ISPN000329: Unable to read rebalancing status from coordinator ws-7-aor-19211: org.infinispan.util.concurrent.TimeoutException: Node ws-7-aor-19211 timed out
at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:248)
at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeRemotely(JGroupsTransport.java:561)
at org.infinispan.topology.ClusterTopologyManagerImpl.fetchRebalancingStatusFromCoordinator(ClusterTopologyManagerImpl.java:129)
at org.infinispan.topology.ClusterTopologyManagerImpl.start(ClusterTopologyManagerImpl.java:118)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.infinispan.commons.util.ReflectionUtil.invokeAccessibly(ReflectionUtil.java:168)
at org.infinispan.factories.AbstractComponentRegistry$PrioritizedMethod.invoke(AbstractComponentRegistry.java:869)
at org.infinispan.factories.AbstractComponentRegistry.invokeStartMethods(AbstractComponentRegistry.java:638)
at org.infinispan.factories.AbstractComponentRegistry.registerComponentInternal(AbstractComponentRegistry.java:207)
at org.infinispan.factories.AbstractComponentRegistry.registerComponent(AbstractComponentRegistry.java:156)
at org.infinispan.factories.AbstractComponentRegistry.getOrCreateComponent(AbstractComponentRegistry.java:277)
at org.infinispan.factories.AbstractComponentRegistry.invokeInjectionMethod(AbstractComponentRegistry.java:227)
at org.infinispan.factories.AbstractComponentRegistry.wireDependencies(AbstractComponentRegistry.java:132)
at org.infinispan.remoting.inboundhandler.GlobalInboundInvocationHandler$2.run(GlobalInboundInvocationHandler.java:156)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.jgroups.TimeoutException: timeout waiting for response from ws-7-aor-19211, request: org.jgroups.blocks.UnicastRequest@75770aa6, req_id=6, mode=GET_ALL, target=ws-7-aor-19211
at org.jgroups.blocks.MessageDispatcher.sendMessage(MessageDispatcher.java:427)
at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.processSingleCall(CommandAwareRpcDispatcher.java:433)
at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.invokeRemoteCommand(CommandAwareRpcDispatcher.java:241)
... 19 more
This is the initalization code for cachemanager:
try {
ctx = new InitialContext();
cacheManager = (DefaultCacheManager)ctx.lookup(SessionConstants.CACHE_MANAGER_GLOBAL_JNDI_NAME);
} catch (NamingException e1) {
logger.error("SessionHooverJob not able to find: java:global/klopotekCacheManager ... a new instance will be created!");
}
if (cacheManager ==null){
...
configurator = ConfiguratorFactory.getStackConfigurator("default-configs/default-jgroups-udp.xml");
ProtocolConfiguration udpConfiguration = configurator.getProtocolStack().get(0);
if ("UDP".equalsIgnoreCase(udpConfiguration.getProtocolName()) && mcastAddr != null){
udpConfiguration.getProperties().put("mcast_addr", mcastAddr);
}
GlobalConfigurationBuilder gcb = new GlobalConfigurationBuilder();
gcb.globalJmxStatistics().enabled(true).allowDuplicateDomains(true);
gcb.transport().defaultTransport()
.addProperty(JGroupsTransport.CONFIGURATION_STRING, configurator.getProtocolStackString());
//.addProperty(JGroupsTransport.CONFIGURATION_FILE, "config/jgroups.xml");
ConfigurationBuilder builder = new ConfigurationBuilder();
builder.clustering().cacheMode(CacheMode.DIST_SYNC).expiration().lifespan(24l, TimeUnit.HOURS);;
cacheManager = new DefaultCacheManager(gcb.build(),
builder.build());
The problem doesn't occur if a time of around 40-60 seconds passes after deploying. If i have 1 JNDI session manager which have built the jgroups channel, even if i undeploy the all the war... why jgroups try to do rebalance again?
Is there some configuration property to set?