I'd like to know the general cost of creating a new connection, compared to UDP. I know TCP requires an initial exchange of packets (the 3 way handshake). What would be other costs? For instance is there some sort of magic in the kernel needed for setting up buffers etc?
The reason I'm asking is I can keep an existing connection open and reuse it as needed. However if there is little overhead reconnecting it would reduce complexity.
OPTION 1: The general cost of creating a TCP connection are:
Step 1: Requires an exchange of packets, so it's delayed by to & from network latency plus the destination server's service time. No significant CPU usage on either box is involved.
Step 2: Depends on the size of the message.
Step 3: IIRC, just sends a 'closing now' packet, w/ no wait for destination ack, so no latency involved.
OPTION 2: Costs of UDP:*
Step 1: Requires minimal setup, no latency worries, very fast.
Step 2: BE CAREFUL OF SIZE, there is no retransmit in UDP since it doesn't care if the packet was received by anyone or not. I've heard that the larger the message, the greater probability of data being received corrupted, and that a rule of thumb is that you'll lose a certain percentage of messages over 20 MB.
Step 3: Minimal work, minimal time.
OPTION 3: Use ZeroMQ Instead
You're comparing TCP to UDP with a goal of reducing reconnection time. THERE IS A NICE COMPROMISE: ZeroMQ sockets.
ZMQ allows you to set up a publishing socket where you don't care if anyone is listening (like UDP), and have multiple listeners on that socket. This is NOT a UDP socket - it's an alternative to both of these protocols.
See: ZeroMQ.org for details.
It's very high speed and fault tolerant, and is in increasing use in the financial industry for those reasons.
Compared to the latency of the packet exchange, all other costs such as kernel setup times are insignificant.
Once a UDP packet's been dumped onto the wire, the UDP protocol stack is free to completely forget about it. With TCP, there's at bare minimum the connection details (source/dest port and source/dest IP), the sequence number, the window size for the connection etc... It's not a huge amount of data, but adds up quickly on a busy server with many connections.
And then there's the 3-way handshake as well. Some braindead (and/or malicious systems) can abuse the process (look up 'syn flood'), or just drop the connection on their end, leaving your system waiting for a response or close notice that'll never come. The plus side is that with TCP the system will do its best to make sure the packet gets where it has to. With UDP, there's no guarantees at all.