Apache.NMS.ActiveMQ client hang when connection ha

2019-08-21 05:22发布

问题:

Our client connects to cluster with 2 ActiveMQ nodes. Apache.NMS.ActiveMQ library hang when we drop network connection manually between client and cluster. We use Apache.NMS.ActiveMQ with version 1.7.2 and cluster based on ActiveMQ 5.15.3.

Connection URI is:

failover:(tcp://dd-amq-app01:61616,tcp://dd-amq-app02:61616)?transport.maxReconnectAttempts=0&transport.startupMaxReconnectAttempts=0&transport.useExponentialBackOff=false&transport.maxReconnectDelay=1000&transport.timeout=10000 

I expect that library should throw exception when it cann't connect or reconnect, but it hang on forever.

Tracer logs:

INFO    2018-08-23 15:29:19,995 [135] DC.ActiveMQ.ActiveMqLogger BrokerUri set = failover:(tcp://dd-amq-app01:61616,tcp://dd-amq-app02:61616)?transport.maxReconnectAttempts=0&transport.startupMaxReconnectAttempts=0&transport.useExponentialBackOff=false&transport.maxReconnectDelay=1000&transport.timeout=10000 
DEBUG   2018-08-23 15:29:19,995 [135] DC.ActiveMQ.ActiveMqLogger SetProperties called with target: ConnectionFactory, and prefix: connection. 
DEBUG   2018-08-23 15:29:19,995 [135] DC.ActiveMQ.ActiveMqLogger SetProperties called with target: ConnectionFactory, and prefix: nms. 
INFO    2018-08-23 15:29:19,995 [46] DC.ActiveMQ.ActiveMqLogger Connecting to: failover:(tcp://dd-amq-app01:61616,tcp://dd-amq-app02:61616)?transport.maxreconnectdelay=1000&transport.timeout=10000&transport.maxreconnectattempts=0&transport.startupmaxreconnectattempts=0&transport.useexponentialbackoff=false 
INFO    2018-08-23 15:29:19,995 [135] DC.ActiveMQ.ActiveMqLogger Connecting to: failover:(tcp://dd-amq-app01:61616,tcp://dd-amq-app02:61616)?transport.maxreconnectdelay=1000&transport.timeout=10000&transport.maxreconnectattempts=0&transport.startupmaxreconnectattempts=0&transport.useexponentialbackoff=false 
DEBUG   2018-08-23 15:29:20,011 [4] DC.ActiveMQ.ActiveMqLogger Consumer[ID:vm-dc-test2-49838-636706331613744874-1:1165:1:1] sending Individual Ack for MessageId: ID:vm-dc-test2-49838-636706331613744874-1:1246:1:1:1:184415 
DEBUG   2018-08-23 15:29:20,011 [4] DC.ActiveMQ.ActiveMqLogger Session sending Ack: MessageAck[ commandId = 0, responseRequired = False, Destination = queue://EdiInbound, TransactionId = , ConsumerId = ID:vm-dc-test2-49838-636706331613744874-1:1165:1:1, AckType = 4, FirstMessageId = , LastMessageId = ID:vm-dc-test2-49838-636706331613744874-1:1246:1:1:1:184415, MessageCount = 1, PoisonCause =  ] 
DEBUG   2018-08-23 15:29:20,011 [2] DC.ActiveMQ.ActiveMqLogger InactivityMonitor[1172]: Stopped Monitor Threads. 
DEBUG   2018-08-23 15:29:20,011 [2] DC.ActiveMQ.ActiveMqLogger InactivityMonitor[1172]: Stopped Monitor Threads. 
DEBUG   2018-08-23 15:29:20,011 [2] DC.ActiveMQ.ActiveMqLogger InactivityMonitor[1170]: Stopped Monitor Threads. 
DEBUG   2018-08-23 15:29:20,011 [2] DC.ActiveMQ.ActiveMqLogger InactivityMonitor[1170]: Stopped Monitor Threads. 
DEBUG   2018-08-23 15:29:20,011 [135] DC.ActiveMQ.ActiveMqLogger Reconnect was triggered but transport is not started yet. Wait for start to connect the transport. 
DEBUG   2018-08-23 15:29:20,011 [135] DC.ActiveMQ.ActiveMqLogger FailoverTransport Started. 
DEBUG   2018-08-23 15:29:20,011 [135] DC.ActiveMQ.ActiveMqLogger FailoverTransport not connected, start is reconnecting. 
DEBUG   2018-08-23 15:29:20,011 [135] DC.ActiveMQ.ActiveMqLogger Waking up reconnect task 
DEBUG   2018-08-23 15:29:20,011 [135] DC.ActiveMQ.ActiveMqLogger Waiting for transport to reconnect. 
DEBUG   2018-08-23 15:29:20,011 [ActiveMQ Failover Worker: 22429852] DC.ActiveMQ.ActiveMqLogger Uri connection list: tcp://dd-amq-app02:61616/,tcp://dd-amq-app01:61616/ from: tcp://dd-amq-app01:61616/,tcp://dd-amq-app02:61616/ 
DEBUG   2018-08-23 15:29:20,011 [2] DC.ActiveMQ.ActiveMqLogger InactivityMonitor[1171]: Stopped Monitor Threads. 
DEBUG   2018-08-23 15:29:20,011 [2] DC.ActiveMQ.ActiveMqLogger InactivityMonitor[1171]: Stopped Monitor Threads. 
DEBUG   2018-08-23 15:29:20,011 [ActiveMQ Failover Worker: 22429852] DC.ActiveMQ.ActiveMqLogger Attempting 0th connect to:  
DEBUG   2018-08-23 15:29:20,011 [ActiveMQ Failover Worker: 22429852] DC.ActiveMQ.ActiveMqLogger Opening socket to: dd-amq-app02 on port: 61616 
DEBUG   2018-08-23 15:29:20,011 [46] DC.ActiveMQ.ActiveMqLogger Reconnect was triggered but transport is not started yet. Wait for start to connect the transport. 
DEBUG   2018-08-23 15:29:20,011 [46] DC.ActiveMQ.ActiveMqLogger FailoverTransport Started. 
DEBUG   2018-08-23 15:29:20,011 [46] DC.ActiveMQ.ActiveMqLogger FailoverTransport not connected, start is reconnecting. 
DEBUG   2018-08-23 15:29:20,011 [46] DC.ActiveMQ.ActiveMqLogger Waking up reconnect task 
DEBUG   2018-08-23 15:29:20,011 [46] DC.ActiveMQ.ActiveMqLogger Waiting for transport to reconnect. 
DEBUG   2018-08-23 15:29:20,011 [ActiveMQ Failover Worker: 22429852] DC.ActiveMQ.ActiveMqLogger Connected to dd-amq-app02:61616 using InterNetwork protocol. 
DEBUG   2018-08-23 15:29:20,026 [ActiveMQ Failover Worker: 22429852] DC.ActiveMQ.ActiveMqLogger Creating Inactivity Monitor: 1173 
DEBUG   2018-08-23 15:29:20,011 [ActiveMQ Failover Worker: 18711279] DC.ActiveMQ.ActiveMqLogger Uri connection list: tcp://dd-amq-app02:61616/,tcp://dd-amq-app01:61616/ from: tcp://dd-amq-app01:61616/,tcp://dd-amq-app02:61616/ 
DEBUG   2018-08-23 15:29:20,026 [ActiveMQ Failover Worker: 18711279] DC.ActiveMQ.ActiveMqLogger Attempting 0th connect to:  
DEBUG   2018-08-23 15:29:20,026 [ActiveMQ Failover Worker: 18711279] DC.ActiveMQ.ActiveMqLogger Opening socket to: dd-amq-app02 on port: 61616 
DEBUG   2018-08-23 15:29:20,026 [ActiveMQ Failover Worker: 22429852] DC.ActiveMQ.ActiveMqLogger Connection established 
DEBUG   2018-08-23 15:29:20,026 [ActiveMQ Failover Worker: 22429852] DC.ActiveMQ.ActiveMqLogger Connection[ID:vm-dc-test2-49838-636706331613744874-1:1247]: Transport has resumed normal operation. 
INFO    2018-08-23 15:29:20,026 [ActiveMQ Failover Worker: 22429852] DC.ActiveMQ.ActiveMqLogger Successfully connected to tcp://dd-amq-app02:61616/ 
DEBUG   2018-08-23 15:29:21,683 [9] DC.ActiveMQ.ActiveMqLogger InactivityMonitor[1094]: Message sent since last write check. Resetting flag. 
DEBUG   2018-08-23 15:29:21,683 [9] DC.ActiveMQ.ActiveMqLogger InactivityMonitor[1094]: A read check is not currently allowed. 
DEBUG   2018-08-23 15:29:30,012 [4] DC.ActiveMQ.ActiveMqLogger Consumer[ID:vm-dc-test2-49838-636706331613744874-1:1165:1:1] No Active TX or pending acks, closing normally. 
DEBUG   2018-08-23 15:29:30,012 [4] DC.ActiveMQ.ActiveMqLogger Shutdown of Consumer[ID:vm-dc-test2-49838-636706331613744874-1:1165:1:1] started. 
DEBUG   2018-08-23 15:29:30,012 [4] DC.ActiveMQ.ActiveMqLogger Shutdown of Consumer[ID:vm-dc-test2-49838-636706331613744874-1:1165:1:1] completed. 
DEBUG   2018-08-23 15:29:30,012 [4] DC.ActiveMQ.ActiveMqLogger Remove of Consumer[ID:vm-dc-test2-49838-636706331613744874-1:1165:1:1] of destination [queue://EdiInbound] sent last delivered Id[184415]. 
INFO    2018-08-23 15:29:30,012 [4] DC.ActiveMQ.ActiveMqLogger Closing The Session with Id ID:vm-dc-test2-49838-636706331613744874-1:1165:1 
INFO    2018-08-23 15:29:30,012 [4] DC.ActiveMQ.ActiveMqLogger Executing Shutdown on Session with Id ID:vm-dc-test2-49838-636706331613744874-1:1165:1 
INFO    2018-08-23 15:29:30,012 [4] DC.ActiveMQ.ActiveMqLogger Closed The Session with Id ID:vm-dc-test2-49838-636706331613744874-1:1165:1 
INFO    2018-08-23 15:29:30,012 [4] DC.ActiveMQ.ActiveMqLogger Connection[ID:vm-dc-test2-49838-636706331613744874-1:1165]: Closing Connection Now. 
DEBUG   2018-08-23 15:29:30,012 [4] DC.ActiveMQ.ActiveMqLogger Connection[ID:vm-dc-test2-49838-636706331613744874-1:1165]: Synchronously disposing of Connection. 
DEBUG   2018-08-23 15:29:31,699 [79] DC.ActiveMQ.ActiveMqLogger InactivityMonitor[1094]: Message sent since last write check. Resetting flag. 
DEBUG   2018-08-23 15:29:31,699 [79] DC.ActiveMQ.ActiveMqLogger InactivityMonitor[1094]: A read check is not currently allowed. 
   at Apache.NMS.ActiveMQ.Transport.FutureResponse.get_Response() in c:\dev\NMS.ActiveMQ\src\main\csharp\Transport\FutureResponse.cs:line 57 
DEBUG   2018-08-23 15:29:35,027 [4] DC.ActiveMQ.ActiveMqLogger Connection[ID:vm-dc-test2-49838-636706331613744874-1:1165]: Disposing of the Transport. 
DEBUG   2018-08-23 15:29:35,027 [4] DC.ActiveMQ.ActiveMqLogger FailoverTransport Stopped. 
DEBUG   2018-08-23 15:29:35,027 [4] DC.ActiveMQ.ActiveMqLogger InactivityMonitor[1094].Runner: Task Runner Shut Down 
DEBUG   2018-08-23 15:29:35,027 [4] DC.ActiveMQ.ActiveMqLogger InactivityMonitor[1094]: Stopped Monitor Threads. 
DEBUG   2018-08-23 15:29:35,027 [135] DC.ActiveMQ.ActiveMqLogger Send Oneway attempt: 0 failed: Message = Wire format negotiation timeout: peer did not send his wire format. 
DEBUG   2018-08-23 15:29:35,027 [135] DC.ActiveMQ.ActiveMqLogger Failed Message Was: ConnectionInfo[ commandId = 1, responseRequired = True, ConnectionId = ID:vm-dc-test2-49838-636706331613744874-1:1247, ClientId = 208221b7-139f-464b-ad6f-2fa99243faea, Password = systemPassword, UserName = system, BrokerPath = , BrokerMasterConnector = False, Manageable = False, ClientMaster = False, FaultTolerant = True, FailoverReconnect = False, ClientIp =  ] 
DEBUG   2018-08-23 15:29:35,027 [135] DC.ActiveMQ.ActiveMqLogger InactivityMonitor[1173]: Stopped Monitor Threads. 
DEBUG   2018-08-23 15:29:35,027 [15] DC.ActiveMQ.ActiveMqLogger TransportFilter disposing of next Transport: MutexTransport 
DEBUG   2018-08-23 15:29:35,027 [15] DC.ActiveMQ.ActiveMqLogger TransportFilter disposing of next Transport: FailoverTransport 
DEBUG   2018-08-23 15:29:35,027 [15] DC.ActiveMQ.ActiveMqLogger FailoverTransport Stopped. 
DEBUG   2018-08-23 15:29:35,027 [4] DC.ActiveMQ.ActiveMqLogger TransportFilter disposing of next Transport: MutexTransport 
DEBUG   2018-08-23 15:29:35,027 [4] DC.ActiveMQ.ActiveMqLogger TransportFilter disposing of next Transport: FailoverTransport 
DEBUG   2018-08-23 15:29:35,027 [4] DC.ActiveMQ.ActiveMqLogger FailoverTransport Already Stopped. 
DEBUG   2018-08-23 15:29:35,043 [135] DC.ActiveMQ.ActiveMqLogger FailoverTransport Started. 
DEBUG   2018-08-23 15:29:35,043 [135] DC.ActiveMQ.ActiveMqLogger FailoverTransport not connected, start is reconnecting. 
DEBUG   2018-08-23 15:29:35,043 [135] DC.ActiveMQ.ActiveMqLogger Waking up reconnect task 
DEBUG   2018-08-23 15:29:35,574 [2] DC.ActiveMQ.ActiveMqLogger InactivityMonitor[1173]: Stopped Monitor Threads. 
DEBUG   2018-08-23 15:29:35,574 [2] DC.ActiveMQ.ActiveMqLogger InactivityMonitor[1173]: Stopped Monitor Threads. 
DEBUG   2018-08-23 15:29:38,059 [ActiveMQ Failover Worker: 63267701] DC.ActiveMQ.ActiveMqLogger Connect fail to: tcp://dd-amq-app01:61616/, reason: Error connecting to dd-amq-app01:61616. 
DEBUG   2018-08-23 15:29:38,059 [ActiveMQ Failover Worker: 63267701] DC.ActiveMQ.ActiveMqLogger Attempting 0th connect to: tcp://dd-amq-app01:61616/ 
DEBUG   2018-08-23 15:29:38,059 [ActiveMQ Failover Worker: 63267701] DC.ActiveMQ.ActiveMqLogger Opening socket to: dd-amq-app02 on port: 61616 
DEBUG   2018-08-23 15:29:41,043 [ActiveMQ Failover Worker: 18711279] DC.ActiveMQ.ActiveMqLogger Connect fail to: tcp://dd-amq-app02:61616/, reason: Error connecting to dd-amq-app02:61616. 
DEBUG   2018-08-23 15:29:41,043 [ActiveMQ Failover Worker: 18711279] DC.ActiveMQ.ActiveMqLogger Attempting 0th connect to: tcp://dd-amq-app02:61616/ 
DEBUG   2018-08-23 15:29:41,043 [ActiveMQ Failover Worker: 18711279] DC.ActiveMQ.ActiveMqLogger Opening socket to: dd-amq-app01 on port: 61616 
DEBUG   2018-08-23 15:29:59,059 [ActiveMQ Failover Worker: 63267701] DC.ActiveMQ.ActiveMqLogger Connect fail to: tcp://dd-amq-app02:61616/, reason: Error connecting to dd-amq-app02:61616. 
ERROR   2018-08-23 15:29:59,059 [ActiveMQ Failover Worker: 63267701] DC.ActiveMQ.ActiveMqLogger Failed to connect to tcp://dd-amq-app01:61616/,tcp://dd-amq-app02:61616/ after: 1 attempt(s) 
DEBUG   2018-08-23 15:29:59,059 [52] DC.ActiveMQ.ActiveMqLogger TransportFilter disposing of next Transport: MutexTransport 
DEBUG   2018-08-23 15:29:59,059 [52] DC.ActiveMQ.ActiveMqLogger TransportFilter disposing of next Transport: FailoverTransport 
DEBUG   2018-08-23 15:29:59,059 [52] DC.ActiveMQ.ActiveMqLogger FailoverTransport Stopped. 

What can I do to prevent hungs? May be other connection parameters in URI?

回答1:

Lets break the Uri

failover:(tcp://dd-amq-app01:61616,tcp://dd-amq-app02:61616)?transport.maxReconnectAttempts=0&transport.startupMaxReconnectAttempts=0&transport.useExponentialBackOff=false&transport.maxReconnectDelay=1000&transport.timeout=10000 
  • The failover:(tcp://dd-amq-app01:61616,tcp://dd-amq-app02:61616) suggest there are two uri as primary & secondary

It will keep on moving as Primary to secondary to Primary... as it is trying to connect

  • transport.maxReconnectAttempts=0 says to try forever

You can have the uri as

failover:(tcp://dd-amq-app01:61616,tcp://dd-amq-app02:61616)?transport.timeout=10000&maxReconnectAttempts=1

And try to catch the Exception

It will throw the Timeout exception after trying once the Timeout expired.

Please refer to http://activemq.apache.org/failover-transport-reference