After I accept() a connection, and then write() to the client socket, is it better to write all the data I intend to send at once, or send it in chunks?
For example:
accept, write 1MB, disconnect
…or…
accept, write 256 bytes, write 256 bytes, … n, disconnect
My gut feeling tells me that the underlying protocol does this automatically, with error correction, etc. Is this correct, or should I be chunking my data?
Before you ask: no, I'm not sure where I got the idea to chunk the data – I think it's an instinct I've picked up from programming C# web services (to get around receive buffer limits, etc., I think). Bad habit?
Note: I'm using C
The TCP stacks on the client and server will break up your data as they see fit, so you can send as much as you like in one chunk. Check the article A User's Guide to TCP Windows by Von Welch.
Years and years ago, I had an application that sent binary data: it did one send with the size of the following buffer, and then another send with the buffer itself (a few hundred bytes). After profiling, we discovered that we could get a major speed-up by combining them into one buffer and sending it just once. We were surprised; even though there is some network overhead on each packet, we didn't think that was going to be a noticeable factor.
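On POSIX systems you can get that single-send behaviour without even copying the two pieces into one buffer, using scatter/gather I/O. A minimal sketch of the idea; the function name and the length-prefix framing here are my own illustration, not the original application's code:

    #include <arpa/inet.h>   /* htonl */
    #include <stdint.h>
    #include <sys/types.h>   /* ssize_t */
    #include <sys/uio.h>     /* writev */

    /* Send a length prefix and its payload with one system call instead
     * of two separate write()s, so both pieces can share a TCP segment. */
    ssize_t send_message(int fd, const void *payload, uint32_t len)
    {
        uint32_t hdr = htonl(len);      /* length in network byte order */
        struct iovec iov[2] = {
            { .iov_base = &hdr,            .iov_len = sizeof hdr },
            { .iov_base = (void *)payload, .iov_len = len        },
        };
        return writev(fd, iov, 2);      /* may still be a short write */
    }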
The Nagle algorithm, which is usually enabled by default on TCP sockets, will likely combine several of those 256-byte writes into the same packet. So for small amounts of data it really doesn't matter whether you send them as one write or several; they should end up in the same packets anyway. Sending the data as one chunk makes more sense if you have a big chunk to begin with.
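Conversely, if you ever need small writes to go out immediately rather than be coalesced, Nagle can be switched off per socket. A minimal sketch, assuming sock is an already-connected TCP socket:

    #include <netinet/in.h>    /* IPPROTO_TCP */
    #include <netinet/tcp.h>   /* TCP_NODELAY */
    #include <sys/socket.h>    /* setsockopt */

    /* Disable the Nagle algorithm so small writes are sent right away
     * instead of being held back while earlier data is still unacked. */
    static int disable_nagle(int sock)
    {
        int one = 1;
        return setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &one, sizeof one);
    }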
At the TCP level, yes: your big buffer will be split into segments when it is too large, and small writes will be coalesced when they are too small.
At the application level, don't let your application deal with unbounded buffer sizes. At some point you need to split them up.
If you are sending a file over a socket, and perhaps processing some of its data along the way (compressing it, say), then you need to split it up into chunks. Otherwise you will use too much RAM when you eventually happen upon a large file, and your program will run out of memory.
RAM is not the only problem. If your buffer gets too big, you may spend too much time reading in the data, or processing it, while the socket sits there idle waiting for data. For this reason it's best to have a parameter for the buffer size, so that you can settle on a value that is neither too small nor too big (see the sketch at the end of this answer).
My claim is not that a TCP socket can't handle a big chunk of data; it can, and I suggest using bigger buffers when sending to get better efficiency. My claim is simply that you shouldn't deal with unbounded buffer sizes in your application.
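A minimal sketch of the bounded-buffer approach, assuming a blocking socket; the name send_file_chunked and the chunk_size parameter are my own choices:

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    /* Stream a file to a socket in fixed-size chunks so memory use stays
     * bounded no matter how large the file is.  chunk_size is the tunable
     * knob: big enough to be efficient, small enough not to hog RAM. */
    static int send_file_chunked(int sock, FILE *fp, size_t chunk_size)
    {
        char *buf = malloc(chunk_size);
        if (buf == NULL)
            return -1;

        size_t n;
        while ((n = fread(buf, 1, chunk_size, fp)) > 0) {
            size_t off = 0;
            while (off < n) {                 /* handle short writes */
                ssize_t w = write(sock, buf + off, n - off);
                if (w < 0) {
                    free(buf);
                    return -1;
                }
                off += (size_t)w;
            }
        }
        free(buf);
        return ferror(fp) ? -1 : 0;
    }

Callers can then experiment with chunk_size (64 KB is a common starting point) to find the value that balances throughput against memory use.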
If you're computing the data between those writes, it may be better to stream it as it becomes available. Also, writing it all at once may produce a short write when the kernel's send buffer fills up (that's probably rare, but it does happen), meaning that your app needs to pause and retry the write, not from the beginning, just from the point where the previous call stopped (a retry pattern is sketched below).
I wouldn't usually go out of my way to chunk the writes, especially not as small as 256-byte chunks. (Since an Ethernet frame has a 1500-byte MTU, leaving roughly 1460 bytes of payload after TCP/IP headers, I'd use chunks at least that large.)
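For the retry itself, a common pattern on a blocking socket looks like the following; write_all is a hypothetical helper name:

    #include <errno.h>
    #include <unistd.h>

    /* Keep calling write() until the whole buffer has been accepted by
     * the kernel, resuming from wherever the previous call stopped. */
    static ssize_t write_all(int fd, const char *buf, size_t len)
    {
        size_t sent = 0;
        while (sent < len) {
            ssize_t n = write(fd, buf + sent, len - sent);
            if (n < 0) {
                if (errno == EINTR)
                    continue;   /* interrupted by a signal: retry */
                return -1;      /* real error */
            }
            sent += (size_t)n;
        }
        return (ssize_t)sent;
    }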
I would send it all in one big chunk and let the underlying layers of the OSI model deal with it. You don't have to worry about how big the chunks you are sending are, as the lower layers will split them up as necessary.
The only absolute answer is to profile your app in each case. There are so many factors that it is not possible to give an exact answer that is correct in all cases.