The TCP standard has a "simultaneous open" feature.
One implication of this feature is that a client trying to connect to a local port in the ephemeral range can occasionally connect to itself (see here).
The client then thinks it is connected to the server, while it is actually connected to itself. On the other side, the server cannot open its listening port, since it is occupied (stolen) by the client.
I'm using RHEL 5.3, and my clients constantly try to connect to a local server. Eventually a client connects to itself.
I want to prevent this situation. I see two possible solutions to the problem:
- Don't use ephemeral ports for server ports. Agree on an ephemeral port range and configure it on your machines (see ephemeral range).
- Check the result of connect(), as somebody proposes here.
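The second option can be sketched roughly as follows. This is a minimal Python illustration, not the code from the linked proposal: the helper names `is_self_connected` and `connect_checked` are made up for this example. The idea is that a self-connected socket has identical local and remote endpoints, so comparing `getsockname()` with `getpeername()` right after connecting reveals it.

```python
import socket


def is_self_connected(sock):
    """Return True if the socket's local and remote endpoints are identical."""
    return sock.getsockname() == sock.getpeername()


def connect_checked(host, port, timeout=5.0):
    """Connect to (host, port).

    Returns the connected socket, or None if the connection failed or
    turned out to be a self-connection (hypothetical helper for this sketch).
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return None  # refused, timed out, unreachable, ...
    if is_self_connected(sock):
        # We are talking to ourselves: the real server is not listening.
        # Closing the socket releases the port so the server can bind it.
        sock.close()
        return None
    return sock
```

Closing the socket immediately matters here, because as long as the self-connected socket stays open the server cannot bind its port.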
What do you think? How do you handle this issue?
P.S. 1
Apart from the solution, which I am obviously looking for, I'd like you to share your real-life experience with the problem.
When I found the cause of the problem, I was astonished that people at my workplace were not familiar with it. Polling a server by connecting to it periodically is IMHO common practice, so how is it that the problem is not commonly known?
Hmm, that is an odd problem. If your client and server are on the same machine and always will be, perhaps shared memory, a Unix domain socket, or some other form of IPC is a better choice.
Another option would be to run the server on a fixed port and the client on a fixed source port. Say, the server runs on 5000 and the client runs on 5001. You still have a problem binding to either port if something else is already bound to it.
You could run the server on an even port number and force the client to an odd port number. Pick a random number in the ephemeral range, OR it with 1, and then call bind() with that. If bind() fails with EADDRINUSE then pick a different odd port number and try again.
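A rough Python sketch of the odd-port idea is below. The range 32768 to 60999 is only an assumption (a common Linux default), and `bind_to_odd_port` is a name invented for this example. The client would call it before `connect()`, with the server listening on an even port.

```python
import errno
import random
import socket

# Assumed ephemeral range (a common Linux default); adjust to your system.
EPHEMERAL_LOW, EPHEMERAL_HIGH = 32768, 60999


def bind_to_odd_port(sock, addr="127.0.0.1", attempts=100):
    """Bind sock to a random odd port in the assumed ephemeral range.

    Retries with a new port on EADDRINUSE and returns the port that was bound.
    """
    for _ in range(attempts):
        # OR with 1 forces the lowest bit on, so the port is always odd.
        port = random.randint(EPHEMERAL_LOW, EPHEMERAL_HIGH) | 1
        try:
            sock.bind((addr, port))
            return port
        except OSError as exc:
            if exc.errno != errno.EADDRINUSE:
                raise  # some other failure: do not hide it
    raise OSError(errno.EADDRINUSE, "no free odd port found")
```

Since the client's source port is always odd and the server's port is even, the two can never be equal, so the self-connect cannot happen.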
This option isn't actually implemented in most TCP implementations. Do you have an actual problem?
In my opinion, this is a bug in the TCP spec; listening sockets shouldn't be able to send unsolicited SYNs, and receiving a SYN (rather than a SYN+ACK) after you've sent one should be illegal and result in a reset, which would quickly let the client close the unluckily-chosen local port. But nobody asked for my opinion ;)
As you say, the obvious answer is not to listen in the ephemeral port range. Another solution, if you know you'll be connecting to a local machine, is to design your protocol so that the server sends the first message, and have a short timeout on the client side for receiving that message.
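The "server speaks first" idea could look roughly like this in Python. The greeting `b"HELLO\n"`, the one-second timeout, and the function name `server_is_alive` are all assumptions made for this sketch. A self-connected client never receives the greeting, because nobody sends it, so the read times out.

```python
import socket


def server_is_alive(host, port, greeting=b"HELLO\n", timeout=1.0):
    """Connect and wait briefly for the server's greeting.

    Returns True only if the expected greeting arrives within the timeout.
    A self-connected socket stays silent and is reported as not alive.
    """
    try:
        # The timeout applies to connect() and to every recv() below.
        with socket.create_connection((host, port), timeout=timeout) as sock:
            data = b""
            while len(data) < len(greeting):
                chunk = sock.recv(len(greeting) - len(data))
                if not chunk:
                    return False  # peer closed before sending the greeting
                data += chunk
            return data == greeting
    except OSError:
        # Covers refused connections and timeouts alike.
        return False
```

The `with` block closes the socket in every case, so a self-connection is torn down right after the timeout and the port becomes free again.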
That's an interesting issue! If you're mostly concerned that your server is running, you could always implement a heartbeat mechanism in the server itself to report status to another process. Or you could write a script to check and see if your server process is running.
If you're concerned more about the actual connection to the server being available, I'd suggest moving your client to a different machine. This way you can verify that your server at least has some network connectivity.
When I stumbled into this I was flabbergasted. I could figure out that the outgoing port number accidentally matches the incoming port number, but not why the TCP handshake (SYN SYN-ACK ACK) would succeed (ask yourself: who is sending the ACK if there is nobody doing a listen() and accept()???)
Both Linux and FreeBSD show this behavior.
Anyway, one solution is to stay out of the high range of port numbers for servers.
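To check whether a server port is outside the ephemeral range, something like the following Python sketch might help. It assumes a Linux system, where the range is exposed in `/proc/sys/net/ipv4/ip_local_port_range`. The function names are invented for this example.

```python
def parse_port_range(text):
    """Parse the 'low high' pair as found in ip_local_port_range."""
    low, high = (int(field) for field in text.split())
    return low, high


def read_ephemeral_range(path="/proc/sys/net/ipv4/ip_local_port_range"):
    """Read the ephemeral port range (Linux-specific path)."""
    with open(path) as f:
        return parse_port_range(f.read())


def is_safe_server_port(port, ephemeral_range):
    """Return True if port lies outside the ephemeral range."""
    low, high = ephemeral_range
    return not (low <= port <= high)
```

For instance, `is_safe_server_port(5000, read_ephemeral_range())` would tell you whether port 5000 can be chosen as a client source port on that machine.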
I noticed that Darwin side-steps this issue by not allowing the outgoing port to be the same as the destination port. They must have been bitten by this as well...
An easy way to show this effect is as follows:
And wait for a minute or so and you will be chatting with yourself...
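For a version that does not rely on luck, here is a small Python sketch. It forces the coincidence by binding a socket to a free local port and then connecting it to that same address and port. On Linux I would expect the connect to succeed without any listener, but that is an assumption about the platform, and other stacks may refuse it. The function name `try_self_connect` is made up for this example.

```python
import socket


def try_self_connect(addr="127.0.0.1"):
    """Bind to a free port, then connect the same socket to that port.

    Returns (connected, local_endpoint, peer_endpoint); peer_endpoint is
    None when the stack refuses the connection.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(2.0)
    try:
        sock.bind((addr, 0))        # let the system pick a free port
        local = sock.getsockname()
        try:
            sock.connect(local)     # source and destination are identical
        except OSError:
            return False, local, None
        return True, local, sock.getpeername()
    finally:
        sock.close()


if __name__ == "__main__":
    connected, local, peer = try_self_connect()
    print("connected:", connected, "local:", local, "peer:", peer)
```

When the connect succeeds, the local and peer endpoints are the same, and anything written to the socket can be read back from it.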
Anyway, it makes good job interview material.
Bind the client socket to port 0 so that the system assigns a port, then check the assigned port. If it matches the local server port, you already know the server is down and can skip connect().
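A minimal Python sketch of this approach, assuming IPv4 and a server on the local machine. The name `connect_unless_port_stolen` is hypothetical.

```python
import socket


def connect_unless_port_stolen(host, server_port, timeout=5.0):
    """Bind to a system-assigned port first, then connect.

    Returns the connected socket, or None if the assigned source port
    equals the server port (so the server cannot be listening) or if
    the connection attempt fails.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.bind(("", 0))  # port 0: the system picks a free port
        if sock.getsockname()[1] == server_port:
            # The system handed us the server's own port, which would be
            # impossible if the server were bound to it: skip connect().
            sock.close()
            return None
        sock.connect((host, server_port))
        return sock
    except OSError:
        sock.close()
        return None
```

The check happens before any SYN is sent, so the client never ends up holding a connection to itself.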