We have a JEE app that runs about 40 partitioned jobs on a cluster. It can be deployed on both JBoss and WebSphere. We are experiencing two problems:

1. messaging system failures on both JBoss and WebSphere, typically related to temporary queue connection problems;
2. partitioned jobs effectively hung because of lost messages.
I read a posting that switching the `reply-destination` of the `outbound-gateway` can improve robustness and allow re-connection in the case of failures. The `inbound-gateway` basically starts two listeners on the `requestsQueue`:
<int-jms:inbound-gateway id="springbatch.inbound.gateway"
connection-factory="springbatch.jmsConnectionFactory"
request-channel="springbatch.slave.jms.request"
request-destination="requestsQueue"
reply-channel="springbatch.slave.jms.response"
concurrent-consumers="2"
max-concurrent-consumers="2"/>
Each job has a separate `outbound-gateway`:
<int-jms:outbound-gateway
connection-factory="springbatch.jmsConnectionFactory"
request-channel="jms.channel.1"
request-destination="requestsQueue"
reply-channel="jms.channel.2"
reply-destination="repliesQueue"
correlation-key="JMSCorrelationID" >
<int-jms:reply-listener />
</int-jms:outbound-gateway>
It runs fine on a single server, but on a cluster the partitions run around the cluster and the master step never receives an acknowledgement. I thought `JMSCorrelationID` as the `correlation-key` would handle matching up the JMS messages.

Am I missing a configuration piece?
What you have should work; in that mode, the correlation id is set to `gatewayId + n` (where `gatewayId` is a UUID and `n` increments). The reply container's message selector is set to `JMSCorrelationID LIKE 'gatewayId%'`, so step execution results should be correctly routed back to the master. I suggest you turn on DEBUG logging and follow the messages on both sides to see what's happening.

EDIT:
Re: sharing JMS endpoints (comment below). It can be done, but it would need a little restructuring.
On the producer (master) side, the gateway and a stand-alone aggregator would have to move to a parent context (with each job context being a child of it). Since the partition handler has to be in the child context, you would need a separate aggregator class; that said, the aggregation is orthogonal to the partitioning, and it only lives in that bean for convenience. A common aggregator is fine because it uses the partition handler's correlation id for the job execution, so the reassembled step execution results will be routed to the right partition handler.
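A rough sketch of what such a parent context might look like. All bean/channel ids here (`partition.requests`, `partition.replies`, `partition.aggregated`, `stepExecutionAggregator`) are hypothetical; the gateway attributes mirror the configuration in the question:

```xml
<!-- Hypothetical parent-context fragment (ids are illustrative).
     The shared gateway and a stand-alone aggregator live here;
     each job context is created as a child of this context. -->
<int:channel id="partition.requests"/>
<int:channel id="partition.replies"/>

<int-jms:outbound-gateway
    connection-factory="springbatch.jmsConnectionFactory"
    request-channel="partition.requests"
    request-destination="requestsQueue"
    reply-channel="partition.replies"
    reply-destination="repliesQueue"
    correlation-key="JMSCorrelationID">
    <int-jms:reply-listener/>
</int-jms:outbound-gateway>

<!-- Stand-alone aggregator (the separate aggregator class mentioned
     above); it correlates replies using the partition handler's
     correlation id, so a single shared instance can serve all jobs. -->
<int:aggregator input-channel="partition.replies"
                output-channel="partition.aggregated"
                ref="stepExecutionAggregator"/>
```

Each child (job) context would then send its partition requests to `partition.requests` instead of declaring its own gateway.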
The consumer (slave) side is a bit trickier because, if the inbound gateway is in a single (parent) context, it won't have visibility into the `stepExecutionRequestHandler`s' channels in the child contexts; you would need to build a router to route the requests to the appropriate job contexts. Not impossible, just a bit more work.

The dynamic-ftp Spring Integration sample and its README are a good starting point.