Why when I transfer a file through SFTP, it takes

2019-03-07 21:15发布

I manually copy a file to a server, and the same one to an SFTP server. The file is 140MB.

FTP: I have a rate arround 11MB/s

SFTP: I have a rate arround 4.5MB/s

I understand the file has to be encrypted before being sent. Is it the only impact on the file transfer? (and actually this is not exactly transfer time, but encryption time).

I am suprised of such results.

9条回答
做个烂人
2楼-- · 2019-03-07 21:33

SFTP is not FTP over SSH, it's a different protocol and being similar to SCP, it's offers more capabilities.

查看更多
beautiful°
3楼-- · 2019-03-07 21:37

Yes, encryption add some load to your cpu, but if your cpu is not ancient that should not affect as much as you say.

If you enable compression for SSH, SCP is actually faster than FTP despite the SSH encryption (if I remember, twice as fast as FTP for the files I tried). I haven't actually used SFTP, but I believe it uses SCP for the actual file transfer. So please try this and let us know :-)

查看更多
该账号已被封号
4楼-- · 2019-03-07 21:43

Encryption has not only cpu, but also some network overhead.

查看更多
叛逆
5楼-- · 2019-03-07 21:43

There are all sorts of things which can do this. One possiblity is "Traffic Shaping". This is commonly done in office environments to reserve bandwidth for business critical activities. It may also be done by the web hosting company, or by your ISP, for very similar reasons.

You can also set it up at home very simply.

For example there may be a rule reserving minimum bandwidth for FTP, while SFTP might be falling under an "everything else" rule. Or there might be a rule capping bandwidth for SFTP, but someone else is also using SFTP at the same time as you.

So: Where are you tranferring the file from and to?

查看更多
混吃等死
6楼-- · 2019-03-07 21:48

I'm the author of HPN-SSH and I was asked by a commenter here to weigh in. I'd like to start with a couple of background items. First off, it's important to keep in mind that SSHv2 is a multiplexed protocol - multiple channels over a single TCP connection. As such, the SSH channels are essentially unaware of the underlying flow control algorithm used by TCP. This means that SSHv2 has to implement its own flow control algorithm. The most common implementation basically reimplements sliding windows. The means that you have the SSH sliding window riding on top of the TCP sliding window. The end results is that the effective size of the receive buffer is the minimum of the receive buffers of the two sliding windows. Stock OpenSSH has a maximum receive buffer size of 2MB but this really ends up being closer to ~1.2MB. Most modern OSes have a buffer that can grow (using auto-tuning receive buffers) up to an effective size of 4MB. Why does this matter? If the receive buffer size is less than the bandwidth delay product (BDP) then you will never be able to fully fill the pipe regardless of how fast your system is.

This is complicated by the fact that SFTP adds another layer of flow control onto of the TCP and SSH flow controls. SFTP uses a concept of outstanding messages. Each message may be a command, a result of a command, or bulk data flow. The outstanding messages may be up to a specific datagram size. So you end up with what you might as well think of as yet another receive buffer. The size of this receive buffer is datagram size * maximum outstanding messages (both of which may be set on the command line). The default is 32k * 64 (2MB). So when using SFTP you have to make sure that the TCP receive buffer, the SSH receive buffer, and the SFTP receive buffer are all of sufficient size (without being too large or you can have over buffering problems in interactive sessions).

HPN-SSH directly addresses the SSH buffer problem by having a maximum buffer size of around 16MB. More importantly, the buffer dynamically grows to the proper size by polling the proc entry for the TCP connection's buffer size (basically poking a hole between layers 3 and 4). This avoids overbuffering in almost all situations. In SFTP we raise the maximum number of outstanding requests to 256. At least we should be doing that - it looks like that change didn't propagate as expected to the 6.3 patch set (though it is in 6.2. I'll fix that soon). There isn't a 6.4 version because 6.3 patches cleanly against 6.4 (which is a 1 line security fix from 6.3). You can get the patch set from sourceforge.

I know this sounds odd but right sizing the buffers was the single most important change in terms of performance. In spite of what many people think the encryption is not the real source of poor performance in most cases. You can prove this to yourself by transferring data to sources that are increasingly far away (in terms of RTT). You'll notice that the longer the RTT the lower the throughput. That clearly indicates that this is an RTT dependent performance problem.

Anyway, with this change I started seeing improvements of up to 2 orders of magnitude. If you understand TCP you'll understand why this made such a difference. It's not about the size of the datagram or the number of packets or anything like that. It's entire because in order to make efficient use of the network path you must have a receive buffer equal to the amount of data that can be in transit between the two hosts. This also means that you may not see any improvement whatsoever if the path isn't sufficiently fast and long enough. If the BDP is less than 1.2MB HPN-SSH may be of no value to you.

The parallelized AES-CTR cipher is a performance boost on systems with multiple cores if you need to have full encryption end to end. Usually I suggest people (or have control over both the server and client) to use the NONE cipher switch (encrypted authentication, bulk data passed in clear) as most data isn't all that sensitive. However, this only works in non-interactive sessions like SCP. It doesn't work in SFTP.

There are some other performance improvements as well but nothing as important as the right sizing of the buffers and the encryption work. When I get some free time I'll probably pipeline the HMAC process (currently the biggest drag on performance) and do some more minor optimization work.

So if HPN-SSH is so awesome why hasn't OpenSSH adopted it? That's a long story and people who know the OpenBSD team probably already know the answer. I understand many of their reasons - it's a big patch which would require additional work on their end (and they are a small team), they don't care as much about performance as security (though there is no security implications to HPN-SSH), etc etc etc. However, even though OpenSSH doesn't use HPN-SSH Facebook does. So do Google, Yahoo, Apple, most ever large research data center, NASA, NOAA, the government, the military, and most financial institutions. It's pretty well vetted at this point.

If anyone has any questions feel free to ask but I may not be keeping up to date on this forum. You can always send me mail via the HPN-SSH email address (google it).

查看更多
贼婆χ
7楼-- · 2019-03-07 21:48

Your results make sense. Since FTP operates over a non-encrypted channel it is faster than SFTP (which is subsystem on top of the SSH version 2 protocol). Also remember that SFTP is a packet based protocol unlike FTP which is command based.

Each packet in SFTP is encrypted before being written to the outgoing socket from the client and subsequently decrypted when received by the server. This of-course leads to slow transfer rates but very secure transfer. Using compression such as zlib with SFTP improves the transfer time but still it won't be anywhere near plain text FTP. Perhaps a better comparison is to compare SFTP with FTPS which both use encryption?

Speed for SFTP depends on the cipher used for encryption/decryption, the compression used e.g. zlib, packet sizes and buffer sizes used for the socket connection.

查看更多
登录 后发表回答