可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
I manually copy a file to a server, and the same one to an SFTP server.
The file is 140MB.
FTP: I have a rate arround 11MB/s
SFTP: I have a rate arround 4.5MB/s
I understand the file has to be encrypted before being sent. Is it the only impact on the file transfer? (and actually this is not exactly transfer time, but encryption time).
I am suprised of such results.
回答1:
I'm the author of HPN-SSH and I was asked by a commenter here to weigh in. I'd like to start with a couple of background items. First off, it's important to keep in mind that SSHv2 is a multiplexed protocol - multiple channels over a single TCP connection. As such, the SSH channels are essentially unaware of the underlying flow control algorithm used by TCP. This means that SSHv2 has to implement its own flow control algorithm. The most common implementation basically reimplements sliding windows. The means that you have the SSH sliding window riding on top of the TCP sliding window. The end results is that the effective size of the receive buffer is the minimum of the receive buffers of the two sliding windows. Stock OpenSSH has a maximum receive buffer size of 2MB but this really ends up being closer to ~1.2MB. Most modern OSes have a buffer that can grow (using auto-tuning receive buffers) up to an effective size of 4MB. Why does this matter? If the receive buffer size is less than the bandwidth delay product (BDP) then you will never be able to fully fill the pipe regardless of how fast your system is.
This is complicated by the fact that SFTP adds another layer of flow control onto of the TCP and SSH flow controls. SFTP uses a concept of outstanding messages. Each message may be a command, a result of a command, or bulk data flow. The outstanding messages may be up to a specific datagram size. So you end up with what you might as well think of as yet another receive buffer. The size of this receive buffer is datagram size * maximum outstanding messages (both of which may be set on the command line). The default is 32k * 64 (2MB). So when using SFTP you have to make sure that the TCP receive buffer, the SSH receive buffer, and the SFTP receive buffer are all of sufficient size (without being too large or you can have over buffering problems in interactive sessions).
HPN-SSH directly addresses the SSH buffer problem by having a maximum buffer size of around 16MB. More importantly, the buffer dynamically grows to the proper size by polling the proc entry for the TCP connection's buffer size (basically poking a hole between layers 3 and 4). This avoids overbuffering in almost all situations. In SFTP we raise the maximum number of outstanding requests to 256. At least we should be doing that - it looks like that change didn't propagate as expected to the 6.3 patch set (though it is in 6.2. I'll fix that soon). There isn't a 6.4 version because 6.3 patches cleanly against 6.4 (which is a 1 line security fix from 6.3). You can get the patch set from sourceforge.
I know this sounds odd but right sizing the buffers was the single most important change in terms of performance. In spite of what many people think the encryption is not the real source of poor performance in most cases. You can prove this to yourself by transferring data to sources that are increasingly far away (in terms of RTT). You'll notice that the longer the RTT the lower the throughput. That clearly indicates that this is an RTT dependent performance problem.
Anyway, with this change I started seeing improvements of up to 2 orders of magnitude. If you understand TCP you'll understand why this made such a difference. It's not about the size of the datagram or the number of packets or anything like that. It's entire because in order to make efficient use of the network path you must have a receive buffer equal to the amount of data that can be in transit between the two hosts. This also means that you may not see any improvement whatsoever if the path isn't sufficiently fast and long enough. If the BDP is less than 1.2MB HPN-SSH may be of no value to you.
The parallelized AES-CTR cipher is a performance boost on systems with multiple cores if you need to have full encryption end to end. Usually I suggest people (or have control over both the server and client) to use the NONE cipher switch (encrypted authentication, bulk data passed in clear) as most data isn't all that sensitive. However, this only works in non-interactive sessions like SCP. It doesn't work in SFTP.
There are some other performance improvements as well but nothing as important as the right sizing of the buffers and the encryption work. When I get some free time I'll probably pipeline the HMAC process (currently the biggest drag on performance) and do some more minor optimization work.
So if HPN-SSH is so awesome why hasn't OpenSSH adopted it? That's a long story and people who know the OpenBSD team probably already know the answer. I understand many of their reasons - it's a big patch which would require additional work on their end (and they are a small team), they don't care as much about performance as security (though there is no security implications to HPN-SSH), etc etc etc. However, even though OpenSSH doesn't use HPN-SSH Facebook does. So do Google, Yahoo, Apple, most ever large research data center, NASA, NOAA, the government, the military, and most financial institutions. It's pretty well vetted at this point.
If anyone has any questions feel free to ask but I may not be keeping up to date on this forum. You can always send me mail via the HPN-SSH email address (google it).
回答2:
UPDATE: As a commenter pointed out, the problem I outline below was fixed some time before this post. However, I knew of the HP-SSH project and I asked the author to weigh in. As they explain in the (rightfully) most upvoted answer, encryption is not the source of the problem. Yay for email and people smarter than myself!
Wow, a year-old question with nothing but incorrect answers. However, I must admit that I assumed the slowdown was due to encryption when I asked myself the same question. But ask yourself the next logical question: how quickly can your computer encrypt and decrypt data? If you think that rate is anywhere near the 4.5Mb/second reported by the OP (.5625MBs or roughly half the capacity of a 5.5" floppy disk!) smack yourself a few times, drink some coffee, and ask yourself the same question again.
It apparently has to do with what amounts to be an oversight in the packet size selection, or at least that's what the author of LIBSSH2 says,
The nature of SFTP and its ACK for every small data chunk it sends, makes an initial naive SFTP implementation suffer badly when sending data over high latency networks. If you have to wait a few hundred milliseconds for each 32KB of data then there will never be fast SFTP transfers. This sort of naive implementation is what libssh2 has offered up until and including libssh2 1.2.7.
So the speed hit is due to tiny packet sizes x mandatory ack responses for each packet, which is clearly insane.
The High Performance SSH/SCP (HP-SSH) project provides an OpenSSH patch set which apparently improves the internal buffers as well as parallelizing encryption. Note, however, that even the non-parallelized versions ran at speeds above the 40Mb/s unencrypted speeds obtained by some commenters. The fix involves changing the way in which OpenSSH was calling the encryption libraries, NOT the cipher and there is zero difference in speed between AES128 and AES256. Encryption takes some time, but it is marginal. It might have mattered back in the 90's but (like the speed of Java vs C) it just doesn't matter anymore.
回答3:
Several factors affect speed of SFTP transfer:
- Encryption. Though symmetric encryption is fast, it's not that fast to be unnoticed. If you comparing speeds on fast network (100mbit or larger), encryption becomes a break for your process.
- Hash calculation and checking.
- Buffer copying. SFTP running on top of SSH causes each data block to be copied at least 6 times (3 times on each side) more comparing to plain FTP where data in best cases can be passed to network interface without being copied at all. And block copy takes a bit of time as well.
回答4:
Encryption has not only cpu, but also some network overhead.
回答5:
There are all sorts of things which can do this. One possiblity is "Traffic Shaping". This is commonly done in office environments to reserve bandwidth for business critical activities. It may also be done by the web hosting company, or by your ISP, for very similar reasons.
You can also set it up at home very simply.
For example there may be a rule reserving minimum bandwidth for FTP, while SFTP might be falling under an "everything else" rule. Or there might be a rule capping bandwidth for SFTP, but someone else is also using SFTP at the same time as you.
So: Where are you tranferring the file from and to?
回答6:
SFTP is not FTP over SSH, it's a different protocol and being similar to SCP, it's offers more capabilities.
回答7:
For comparison, I tried transfering a 299GB ntfs disk image from an i5 laptop running Raring Ringtail Ubuntu alpha 2 live cd to an i7 desktop running Ubuntu 12.04.1. Reported speeds:
over wifi + powerline:
scp: 5MB/sec (40 Mbit/sec)
over gigabit ethernet + netgear G5608 v3:
scp: 44MB/sec
sftp: 47MB/sec
sftp -C: 13MB/sec
So, over a good gigabit link, sftp is slightly faster than scp, 2010-era fast CPUs seem fast enough to encrypt,
but compression isn't a win in all cases.
Over a bad gigabit ethernet link, though, I've had sftp far outperform scp. Something about scp being very chatty, see "scp UNBELIEVABLY slow" on comp.security.ssh from 2008:
https://groups.google.com/forum/?fromgroups=#!topic/comp.security.ssh/ldPV3msFFQw
http://fixunix.com/ssh/368694-scp-unbelievably-slow.html
回答8:
Your results make sense. Since FTP operates over a non-encrypted channel it is faster than SFTP (which is subsystem on top of the SSH version 2 protocol). Also remember that SFTP is a packet based protocol unlike FTP which is command based.
Each packet in SFTP is encrypted before being written to the outgoing socket from the client and subsequently decrypted when received by the server. This of-course leads to slow transfer rates but very secure transfer. Using compression such as zlib with SFTP improves the transfer time but still it won't be anywhere near plain text FTP. Perhaps a better comparison is to compare SFTP with FTPS which both use encryption?
Speed for SFTP depends on the cipher used for encryption/decryption, the compression used e.g. zlib, packet sizes and buffer sizes used for the socket connection.
回答9:
Yes, encryption add some load to your cpu, but if your cpu is not ancient that should not affect as much as you say.
If you enable compression for SSH, SCP is actually faster than FTP despite the SSH encryption (if I remember, twice as fast as FTP for the files I tried). I haven't actually used SFTP, but I believe it uses SCP for the actual file transfer. So please try this and let us know :-)