TCP/UDP and Ethernet MTU Fragmentation

Published 2019-03-31 19:28

Question:

I've read various sites and tutorials online but I am still confused. If the message is bigger than the IP MTU, then send() returns the number of bytes sent. What happens to the rest of the message? Am I supposed to call send() again and attempt to send the rest of the message? Or is that something the IP layer should take care of automatically?

Answer 1:

If you are using TCP then the interface presented to you is that of a stream of bytes. You don't need to worry about how the stream of bytes gets from one end of the connection to the other. You can ignore the IP layer's MTU. In fact you can ignore the IP layer entirely.

When you call send() the TCP stack on your machine will deal with all the details necessary for the stream of bytes that you are pushing into your send calls to appear from recv() calls at the other end of the connection.

The one thing to remember is that with TCP you are dealing with a stream and that means that one send() may result in data arriving in multiple recv() calls and multiple send() calls may result in data arriving in a single recv() call. You have no control over this. You are dealing with a stream of bytes and each call to recv() can return any number of bytes from 1 to the number currently outstanding (allowing for adequate buffers passed to the recv() call).
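Because recv() can hand back any number of bytes, an application that wants whole messages has to add its own framing on top of the stream. Here is a minimal sketch of one common approach, a 4-byte length prefix. The helper names recv_exact, send_message and recv_message are my own for illustration, not part of any socket API, and the length-prefix format is an assumption about your protocol.

```python
import socket
import struct


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes from a stream socket, looping over short reads."""
    chunks = []
    remaining = n
    while remaining > 0:
        # recv() may return anywhere from 1 to `remaining` bytes.
        chunk = sock.recv(remaining)
        if not chunk:
            # An empty result means the peer closed the connection.
            raise ConnectionError(
                f"connection closed with {remaining} bytes outstanding"
            )
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_message(sock: socket.socket, payload: bytes) -> None:
    """Send one message as a 4-byte big-endian length followed by the payload."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_message(sock: socket.socket) -> bytes:
    """Receive one length-prefixed message, however the stream was chunked."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)
```

With this in place, two send_message() calls always come out as two recv_message() results, regardless of how TCP happened to split or coalesce the bytes in between.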

Since the commenters asked for it ;)

On most TCP stacks, the most likely reason for send() to accept only part of your data is that the stack's send buffers are full and (probably) the TCP window is also full, so flow control is in operation. In that state the stack can't send any more data until the remote end ACKs some, and it's not prepared to buffer any more on your behalf. I've not come across a TCP stack that will refuse a send() due to MTU considerations alone, but I guess some slimmed-down embedded systems might behave that way...

Anyway, if send() returns fewer bytes than you supplied, then you should send the remaining data at some point. On a blocking socket, send() will often block and wait until it can send all of the data. If you've set the socket to non-blocking mode, then you probably do NOT want to retry immediately when a send is incomplete, as you'll likely end up in a tight loop; wait until the socket is reported writable again instead.
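To make the non-blocking case concrete, here is a rough sketch of a send loop that waits for writability with select() rather than spinning. The function name send_all_nonblocking and the 5 second timeout are assumptions of mine; a real program would more likely fold this into its existing event loop.

```python
import select
import socket


def send_all_nonblocking(
    sock: socket.socket, data: bytes, timeout: float = 5.0
) -> None:
    """Send all of `data` on a non-blocking socket without busy-waiting."""
    view = memoryview(data)  # lets us slice without copying
    sent_total = 0
    while sent_total < len(view):
        # Sleep until the kernel reports free send-buffer space.
        _, writable, _ = select.select([], [sock], [], timeout)
        if not writable:
            raise TimeoutError("socket did not become writable in time")
        try:
            # send() may accept only part of what is offered.
            sent = sock.send(view[sent_total:])
        except BlockingIOError:
            # The buffer filled up again between select() and send(); wait again.
            continue
        sent_total += sent
```

On a blocking socket in Python, sock.sendall(data) already performs the equivalent loop for you.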

It would probably be useful for you to be more specific about the operating system that you're using.



Answer 2:

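If a packet is too large to cross a link along the path and is not allowed to be fragmented (the Don't Fragment flag is set), the router that cannot forward it drops it and sends back an ICMP "Fragmentation Needed" message, signaling the sender to reduce the packet size and try again.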

If you use TCP these are all details you should expect the network layer to take care of for you. What modern IP stacks actually do behind the scenes to figure out the lowest MTU along the path seems to have become somewhat of a black art.

WRT UDP you can still expect the stack to fragment for you, but in practice, given the typical use cases for UDP, that is not ideal. Depending on your application, you are likely to see better performance by explicitly understanding the path MTU and keeping each datagram within it.
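As a back-of-the-envelope illustration of what "understanding the path MTU" means for UDP, the largest payload that avoids IP fragmentation is the path MTU minus the IP and UDP headers. The sketch below assumes a 1500-byte MTU (typical for Ethernet, but your actual path MTU may be smaller), an IPv4 header without options, and an IPv6 header without extension headers. The helper names are made up for the example.

```python
from typing import List

IPV4_HEADER = 20  # bytes, assuming no IP options
IPV6_HEADER = 40  # bytes, fixed header only, no extension headers
UDP_HEADER = 8    # bytes


def max_udp_payload(path_mtu: int, ipv6: bool = False) -> int:
    """Largest UDP payload that fits in one unfragmented IP packet."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    size = path_mtu - ip_header - UDP_HEADER
    if size <= 0:
        raise ValueError("path MTU too small to carry any UDP payload")
    return size


def split_for_udp(
    data: bytes, path_mtu: int = 1500, ipv6: bool = False
) -> List[bytes]:
    """Split data into chunks that each fit in a single unfragmented datagram."""
    size = max_udp_payload(path_mtu, ipv6)
    return [data[i:i + size] for i in range(0, len(data), size)]


print(max_udp_payload(1500))             # 1472 for IPv4
print(max_udp_payload(1500, ipv6=True))  # 1452 for IPv6
```

Note that splitting the data yourself only avoids fragmentation; UDP still gives no ordering or delivery guarantees, so the application has to cope with lost or reordered chunks.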

On the send() question: some stacks behave differently, but the treatment WRT your code should be the same. Let's say you have 100 bytes to send and send() returns 10 bytes sent. You need to keep calling send() with the remaining 90 bytes until all of it has been accepted, so that the entire message is sent.

Using blocking sockets on the Windows platform, send() will usually return after everything is sent. On other platforms, such as Linux, you may need to call send() more often to push all of the data.