I can reliably get a Winsock socket to connect() to itself if I connect to localhost with a port in the range of automatically assigned ephemeral ports (5000–65534). Specifically, Windows appears to have a system-wide rolling port number which is the next port that it will try to assign as a local port number for a client socket. If I create sockets until the assigned number is just below my target port number, and then repeatedly create a socket and attempt to connect to that port number, I can usually get the socket to connect to itself.
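To see how a given machine hands out local ports, a small probe along these lines may help. It is only a sketch: `sample_ephemeral_ports` is a hypothetical helper name, and whether the ports come out sequential (as described above for Windows 7) or scattered depends on the operating system.

```python
import socket

def sample_ephemeral_ports(count=5):
    """Bind `count` sockets to port 0 and return the local ports the OS chose."""
    socks = []
    ports = []
    try:
        for _ in range(count):
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.bind(('127.0.0.1', 0))   # port 0 asks the OS to pick a free port
            socks.append(s)            # keep it open so the port stays reserved
            ports.append(s.getsockname()[1])
    finally:
        for s in socks:
            s.close()
    return ports

if __name__ == '__main__':
    print(sample_ephemeral_ports())
```

If the printed ports increase by one each time, the allocator is behaving like the rolling counter described above.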
I first got it to happen in an application that repeatedly tries to connect to a certain port on localhost. When the service is not listening, it very rarely establishes a connection anyway and receives the message that it initially sent (which happens to be a Redis PING command).
An example, in Python (run with nothing listening to the target port):
```python
import socket

TARGET_PORT = 49400

def mksocket():
    return socket.socket(socket.AF_INET, socket.SOCK_STREAM, socket.IPPROTO_TCP)

# Bind throwaway sockets until the assigned port is just below TARGET_PORT.
while True:
    sock = mksocket()
    sock.bind(('127.0.0.1', 0))
    host, port = sock.getsockname()
    if port > TARGET_PORT - 10 and port < TARGET_PORT:
        break
    print(port)

# Keep connecting to TARGET_PORT; connect() picks the local port implicitly.
while port < TARGET_PORT:
    sock = mksocket()
    err = None
    try:
        sock.connect(('127.0.0.1', TARGET_PORT))
    except socket.error as e:
        err = e
    host, port = sock.getsockname()
    if err:
        print('Unable to connect to port %d, used local port %d: %s' % (TARGET_PORT, port, err))
    else:
        print('Connected to port %d, used local port %d' % (TARGET_PORT, port))
```
On my Mac machine, this eventually terminates with "Unable to connect to port 49400, used local port 49400". On my Windows 7 machine, a connection is successfully established and it prints "Connected to port 49400, used local port 49400". The resulting socket receives any data that is sent to it.
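The same situation can be provoked without waiting for the port counter to line up, by binding a socket explicitly and then connecting it to its own address. This is a sketch under the assumption that an explicit bind() followed by connect() is handled like the implicit case above; `try_self_connect` is a hypothetical helper, and whether the connect succeeds depends on the platform's TCP stack.

```python
import socket

def try_self_connect(payload=b'PING'):
    """Bind to an OS-chosen loopback port, then connect to that same port.

    Returns (connected, echoed): `connected` is True if connect() succeeded,
    and `echoed` is what the socket read back after sending `payload`
    (None if the connect failed).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(2.0)              # avoid hanging if the stack drops the SYN
    try:
        sock.bind(('127.0.0.1', 0))
        addr = sock.getsockname()     # local address == remote address
        try:
            sock.connect(addr)
        except socket.error:
            return False, None
        sock.sendall(payload)
        echoed = b''
        while len(echoed) < len(payload):
            chunk = sock.recv(len(payload) - len(echoed))
            if not chunk:
                break
            echoed += chunk
        return True, echoed
    finally:
        sock.close()

if __name__ == '__main__':
    print(try_self_connect())
```

Where the connect succeeds, the socket should read back exactly what it wrote, which matches the echoed PING described earlier.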
Is this a bug in Winsock? Is this a bug in my code?
Edit: Here is a screenshot of TcpView with the offending connection shown: [screenshot not included]
It is a logic bug in your code.
First off, only newer versions of Windows use 5000–65534 as ephemeral ports. Older versions used 1025-5000 instead.
You are creating multiple sockets that are explicitly bound to random ephemeral ports until you have bound a socket whose port is fewer than 10 below your target port. However, if any of those sockets happens to bind to the target port itself, you ignore that and keep looping. So you may or may not end up with a socket that is bound to the target port, and you may or may not end up with a final port value that is actually less than the target port.
After that, if port happens to be less than your target port (which is not guaranteed), you are then creating more sockets that are implicitly bound to different random available ephemeral ports when calling connect() (it does an implicit bind() internally if bind() has not been called yet), none of which will be the same ephemeral ports that you explicitly bound to, since those ports are already in use and cannot be used again.
At no point do you have any given socket connecting from an ephemeral port to the same ephemeral port. And unless another app happens to have bound itself to your target port and is actively listening on that port, there is no way that connect() can successfully connect to the target port on any of the sockets you create, since none of them are in the listening state. Also, getsockname() is not valid on an unbound socket, and a connecting socket is not guaranteed to be bound if connect() fails. So the symptoms you think are happening are actually physically impossible given the code you have shown. Your logging is simply making the wrong assumptions and thus is logging the wrong things, giving you a false picture of the state.
Try something more like this instead, and you will see what the real ports are:
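What follows is only a rough sketch of that kind of logging, not the answerer's original code. It binds explicitly before connecting so that the local port is known regardless of whether connect() fails; `attempt` is a hypothetical helper name.

```python
import socket

def attempt(target_port, host='127.0.0.1'):
    """Try one connection and return (local_port, error).

    The socket is bound explicitly first, so getsockname() is called on a
    socket that is known to be bound, whatever connect() does afterwards.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(2.0)
    try:
        sock.bind((host, 0))                  # let the OS pick the local port
        local_port = sock.getsockname()[1]
        error = None
        try:
            sock.connect((host, target_port))
        except socket.error as e:
            error = e
        return local_port, error
    finally:
        sock.close()

if __name__ == '__main__':
    local_port, error = attempt(49400)
    print('local port %d -> target 49400: %s' % (local_port, error or 'connected'))
```

Calling `attempt` in a loop and logging both values shows, for each try, which local port was really used and whether the connection was refused.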
This appears to be a 'simultaneous initiation' as described in section 3.4 of RFC 793. See Figure 8. Note that neither side is in state LISTEN at any stage. In your case, both ends are the same socket, so the exchange works exactly as described in the RFC.
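The state transitions in that figure can be traced with a toy model. This is a simplified sketch, not a TCP implementation: it ignores sequence numbers, retransmission and resets, and the names `on_segment` and `simultaneous_open` are made up for illustration.

```python
def on_segment(state, flags):
    """Return (new_state, reply_flags) for the transitions used in a simultaneous open."""
    if state == 'SYN-SENT' and flags == {'SYN'}:
        return 'SYN-RECEIVED', {'SYN', 'ACK'}   # got a bare SYN while waiting for SYN,ACK
    if state == 'SYN-RECEIVED' and 'ACK' in flags:
        return 'ESTABLISHED', None              # our SYN has been acknowledged
    return state, None

def simultaneous_open():
    """Simulate two active opens crossing on the wire; return the state history."""
    # Both ends did an active open: CLOSED -> SYN-SENT, each sending a SYN.
    states = {'A': 'SYN-SENT', 'B': 'SYN-SENT'}
    history = [dict(states)]
    in_flight = [('A', 'B', {'SYN'}), ('B', 'A', {'SYN'})]   # (src, dst, flags)
    while in_flight:
        src, dst, flags = in_flight.pop(0)
        states[dst], reply = on_segment(states[dst], flags)
        history.append(dict(states))
        if reply:
            in_flight.append((dst, src, reply))
    return history

if __name__ == '__main__':
    for step in simultaneous_open():
        print(step)
```

Both ends go from SYN-SENT through SYN-RECEIVED to ESTABLISHED without ever entering LISTEN. In the self-connect case, A and B are the same socket, so every segment it sends is also the segment it receives.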