I'm trying to figure out the best way to transfer large amounts of data over a network between two systems. I am currently looking into FTP, HTTP, and rsync, and I am wondering which one is the fastest. I've looked online for some answers and found the following sites:
The problem is that these are old and talk more about the theoretical differences in how the protocols communicate. I am more interested in actual benchmarks that can say, for a specific setup, that when transferring files of varying sizes one protocol is x% faster than the others.
Has anyone tested these and posted the results somewhere?
I'm afraid that if you want to know the answer for your needs and setup, you will either have to be more specific or do your own performance (and reliability) tests. It does help to have at least a rudimentary understanding of the protocols in question and how they communicate, so I'd consider the articles you've been quoting a helpful resource. It also helps to know which restrictions the early inventors of these protocols faced: was their aim to keep network impact low, were they memory-starved, or did they have to count their CPU cycles? Here are a few things to consider or answer if you want an answer tailored to your situation:
Lots of things to consider, and I'm sure the list isn't even complete. If you do decide to benchmark, the sketch below is one way to get comparable numbers across tools.
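A rough timing harness along these lines could work; all names (host, file, paths) are placeholders, and it assumes GNU date and bc are available and that credentials or SSH keys for each tool are already configured:

    HOST=remote.example.com
    FILE=testfile.bin

    for cmd in "scp $FILE $HOST:/tmp/" \
               "rsync $FILE $HOST:/tmp/" \
               "curl -s -T $FILE ftp://$HOST/tmp/"; do
        total=0
        for run in 1 2 3; do
            start=$(date +%s.%N)
            eval "$cmd" > /dev/null
            end=$(date +%s.%N)
            total=$(echo "$total + ($end - $start)" | bc)
        done
        # Note: rsync may skip a file that already exists remotely;
        # delete the remote copy between runs for a fair comparison.
        echo "$cmd -> $(echo "scale=2; $total / 3" | bc) s average"
    done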
If the machines at each end are reasonably powerful (i.e. not netbooks, NAS boxes, toasters, etc.), then I would expect all protocols that work over TCP to transfer bulk data at much the same speed. The application protocol's job is really just to fill a buffer for TCP to transfer; as long as it can keep that buffer full, TCP will set the pace.
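One way to check whether TCP, rather than the protocol, is setting the pace is to measure a raw TCP stream as a baseline. A minimal sketch with netcat (hostname and port are placeholders, and the -l syntax varies between netcat versions):

    # On the receiving machine: discard everything arriving on TCP port 9000.
    nc -l 9000 > /dev/null

    # On the sending machine: push 1 GB of zeros through a bare TCP stream.
    dd if=/dev/zero bs=1M count=1024 | nc receiver.example.com 9000

The throughput dd reports is roughly the ceiling any TCP-based protocol can reach on that link; if FTP or HTTP gets close to it, the protocol isn't your bottleneck.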
Protocols which do compression or encryption may bottleneck at the CPU on less powerful machines. My netbook does FTP much faster than SCP.
rsync does clever things to transmit incremental changes quickly, but for bulk transfers it has no advantage over dumber protocols.
rsync optionally compresses its data. That typically makes the transfer go much faster. See rsync -z.
You didn't mention scp, but scp -C also compresses.
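For reference, the two flags mentioned above look like this in practice (host and paths are placeholders):

    # rsync with on-the-wire compression:
    rsync -avz /data/bigfile user@remote.example.com:/backup/

    # scp with compression enabled:
    scp -C /data/bigfile user@remote.example.com:/backup/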
Do note that compression might make the transfer go faster or slower, depending upon the speed of your CPU and of your network link. (Slower links and faster CPU make compression a good idea; faster links and slower CPU make compression a bad idea.) As with any optimization, measure the results in your own environment.
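A simple way to measure is to run the same transfer with and without compression and compare wall-clock times, e.g. (placeholder host and file; remove the remote copy between runs so rsync doesn't skip or delta-skip the file):

    time rsync -a  /data/sample.bin user@remote.example.com:/tmp/
    time rsync -az /data/sample.bin user@remote.example.com:/tmp/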
Another utility to consider is bbcp: http://www.slac.stanford.edu/~abh/bbcp/.
A good, but dated, tutorial on using it is here: http://pcbunn.cithep.caltech.edu/bbcp/using_bbcp.htm. I have found that bbcp is extremely good at transferring large files (multiple GBs); in my experience, it is faster than rsync on average.
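As a starting point, a typical multi-stream bbcp invocation looks something like the following (flags as described in the tutorial above; verify them against your bbcp version, and treat host and paths as placeholders):

    # -s: number of parallel TCP streams, -w: TCP window size,
    # -P: progress report interval in seconds.
    bbcp -P 10 -s 16 -w 2M /data/huge.iso user@remote.example.com:/backup/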
Alright, so I set up the following test:
I uploaded the following groups of files to each server:
I got the following average results over multiple runs (numbers in seconds):
So, it seems that FTP is slightly faster with large files, and HTTP is a little faster with many small files. All in all, I think the two are comparable, and that the server implementation matters much more than the protocol.