Tyrus client cannot reconnect on server restart wh

2019-08-13 00:04发布

问题:

Here is the scenario:

  1. Websocket server is up
  2. Tyrus client shared container is enabled
  3. Tyrus client connects to server (everything is good)
  4. Websocket server restarts
  5. Tyrus client cannot connect to the server and throws the following exception even after the server is up:

javax.websocket.DeploymentException: Connection failed

  1. If the client application is restarted it can connect to server again

Note: This issue will not happen if Tyrus client shared container is disabled.

stacktrace:

javax.websocket.DeploymentException: Connection failed.
    at org.glassfish.tyrus.container.grizzly.client.GrizzlyClientSocket._connect(GrizzlyClientSocket.java:428) ~[tyrus-container-grizzly-client-1.11.jar:?]
    at org.glassfish.tyrus.container.grizzly.client.GrizzlyClientSocket.access$000(GrizzlyClientSocket.java:103) ~[tyrus-container-grizzly-client-1.11.jar:?]
    at org.glassfish.tyrus.container.grizzly.client.GrizzlyClientSocket$1.call(GrizzlyClientSocket.java:235) ~[tyrus-container-grizzly-client-1.11.jar:?]
    at org.glassfish.tyrus.container.grizzly.client.GrizzlyClientSocket$1.call(GrizzlyClientSocket.java:231) ~[tyrus-container-grizzly-client-1.11.jar:?]
    at org.glassfish.tyrus.container.grizzly.client.GrizzlyClientSocket.connect(GrizzlyClientSocket.java:249) ~[tyrus-container-grizzly-client-1.11.jar:?]
    at org.glassfish.tyrus.container.grizzly.client.GrizzlyClientContainer.openClientSocket(GrizzlyClientContainer.java:95) ~[tyrus-container-grizzly-client-1.11.jar:?]
    at org.glassfish.tyrus.client.ClientManager$3$1.run(ClientManager.java:663) ~[tyrus-client-1.11.jar:?]
    at org.glassfish.tyrus.client.ClientManager$3.run(ClientManager.java:712) ~[tyrus-client-1.11.jar:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_31]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_31]
    at org.glassfish.tyrus.client.ClientManager$SameThreadExecutorService.execute(ClientManager.java:866) ~[tyrus-client-1.11.jar:?]
    at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112) ~[?:1.8.0_31]
    at org.glassfish.tyrus.client.ClientManager.connectToServer(ClientManager.java:511) ~[tyrus-client-1.11.jar:?]
    at org.glassfish.tyrus.client.ClientManager.connectToServer(ClientManager.java:373) ~[tyrus-client-1.11.jar:?]
    ...
Caused by: java.lang.NullPointerException
    at org.glassfish.grizzly.nio.RoundRobinConnectionDistributor$SharedIterator.next(RoundRobinConnectionDistributor.java:146) ~[grizzly-framework-2.3.15-gfa.jar:2.3.15-gfa]
    at org.glassfish.grizzly.nio.RoundRobinConnectionDistributor.registerChannelAsync(RoundRobinConnectionDistributor.java:101) ~[grizzly-framework-2.3.15-gfa.jar:2.3.15-gfa]
    at org.glassfish.grizzly.nio.transport.TCPNIOConnectorHandler.connectAsync(TCPNIOConnectorHandler.java:177) ~[grizzly-framework-2.3.15-gfa.jar:2.3.15-gfa]
    at org.glassfish.grizzly.AbstractSocketConnectorHandler.connect(AbstractSocketConnectorHandler.java:91) ~[grizzly-framework-2.3.15-gfa.jar:2.3.15-gfa]
    at org.glassfish.grizzly.AbstractSocketConnectorHandler.connect(AbstractSocketConnectorHandler.java:79) ~[grizzly-framework-2.3.15-gfa.jar:2.3.15-gfa]
    at org.glassfish.tyrus.container.grizzly.client.GrizzlyClientSocket._connect(GrizzlyClientSocket.java:386) ~[tyrus-container-grizzly-client-1.11.jar:?]
    ... 41 more

The NPE is coming from line if (runners.length == 1) { in grizzly project file RoundRobinConnectionDistributor.java:146

private final class SharedIterator implements Iterator {
private final AtomicInteger counter = new AtomicInteger();

@Override
public SelectorRunner next() {
    final SelectorRunner[] runners = getTransportSelectorRunners();
    if (runners.length == 1) {
        return runners[0];
    }

    return runners[(counter.getAndIncrement() & 0x7fffffff) % runners.length];
}

@Override
public SelectorRunner nextService() {
    return next();
}
}

Looks like that runners is null.

回答1:

For anyone interested this issue is fixed in Tyrus 1.12 by this pull request. Thanks to @pavel and his team for taking action :)