Infuriating heisenbug in Java/WAS 8.5 Liberty profile

Posted 2019-07-15 12:31

Question:

I get a socket bind error on Java 6 / WebSphere 8.5 (Liberty profile, a cut-down, usable version of WebSphere). When I kill the app server and immediately start it again, I get:

[ERROR ] CWWKO0221E: TCP Channel defaultHttpEndpoint initialization did not succeed. The socket bind did not succeed for host * and port 9988. The port might already be in use.

This is because either Java or WAS has not released its IPv6 sockets properly.

But here's the snag: when I run WLP under strace (with the -f option to follow child processes), the bind error does NOT happen.

What is going on? Why can't I catch this via strace?

I can get around this problem by specifying soReuseAddr, but what worries me is how to catch the problem via strace (without relying on dumb luck, that is), and why it does not show up when the server runs under strace.

Answer 1:

You may find that adding the soReuseAddr option to your httpEndpoint configuration helps, particularly on Linux platforms. For example:

<httpEndpoint id="defaultHttpEndpoint"
              host="*"
              httpPort="9080">
    <tcpOptions soReuseAddr="true" />
</httpEndpoint>

It can take a while for the OS to release a port after the process holding it has gone, despite the best attempts of the server. This is particularly noticeable with Liberty, since it tends to restart quickly.
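To make the effect of the soReuseAddr option more concrete, here is a minimal sketch of the same socket option at the plain java.net level. It is not Liberty's own code; the class name ReuseAddrSketch and the helper openListener are hypothetical, and the default port 9988 is simply the one from the error message in the question. The point it illustrates is that SO_REUSEADDR has to be set on an unbound socket before bind() is called.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical illustration only; this is not how Liberty is implemented.
public class ReuseAddrSketch {

    // Opens a listening socket on the wildcard address, setting
    // SO_REUSEADDR before the bind so the option can take effect.
    static ServerSocket openListener(int port, boolean reuseAddr) throws IOException {
        ServerSocket server = new ServerSocket();     // created unbound
        server.setReuseAddress(reuseAddr);            // must precede bind()
        server.bind(new InetSocketAddress(port));     // wildcard host, like host="*"
        return server;
    }

    public static void main(String[] args) throws IOException {
        // Port from the question's error message; pass another one as an argument.
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 9988;
        ServerSocket server = openListener(port, true);
        try {
            System.out.println("Bound to port " + server.getLocalPort()
                    + ", SO_REUSEADDR=" + server.getReuseAddress());
        } finally {
            server.close();                           // Java 6 style cleanup
        }
    }
}
```

Calling openListener with reuseAddr set to false immediately after killing a process that was serving connections on the same port is one way to try to reproduce the bind failure outside the app server, although whether it fails depends on the OS and on the state of the old connections.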