Copy or rsync command

2020-05-12 05:21发布

问题:

The following command is working as expected...

cp -ur /home/abc/* /mnt/windowsabc/

Does rsync has any advantage over it? Is there a better way to keep to backup folder in sync every 24 hours?

回答1:

Rsync is better since it will only copy only the updated parts of the updated file, instead of the whole file. It also uses compression and encryption if you want. Check out this tutorial.



回答2:

rsync is not necessarily more efficient, due to the more detailed inventory of files and blocks it performs. The algorithm is fantastic at what it does, but you need to understand your problem to know if it is really going to be the best choice.

On a very large file system (say many thousands or millions of files) where files tend to be added but not updated, "cp -u" will likely be more efficient. cp makes the decision to copy solely on metadata and can simply get to the business of copying.

Note that you might want some buffering, e.g. by using tar rather than straight cp, depending on the size of the files, network performance, other disk activity, etc. I find the following idea very useful:

tar cf - . | tar xCf directory -

Metadata itself may actually become a significant overhead on very large (cluster) file systems, but rsync and cp will share this problem.

rsync seems to frequently be the preferred tool (and in general purpose applications is my usual default choice), but there are probably many people who blindly use rsync without thinking it through.



回答3:

The command as written will create new directories and files with the current date and time stamp, and yourself as the owner. If you are the only user on your system and you are doing this daily it may not matter much. But if preserving those attributes matters to you, you can modify your command with

cp -pur /home/abc/* /mnt/windowsabc/

The -p will preserve ownership, timestamps, and mode of the file. This can be pretty important depending on what you're backing up.

The alternative command with rsync would be

rsync -avh /home/abc/* /mnt/windowsabc

With rsync, -a indicates "archive" which preserves all those attributes mentioned above. -v indicates "verbose" which just lists what it's doing with each file as it runs. -z is left out here for local copies, but is for compression, which will help if you are backing up over a network. Finally, the -h tells rsync to report sizes in human-readable formats like MB,GB,etc.

Out of curiosity, I ran one copy to prime the system and avoid biasing against the first run, then I timed the following on a test run of 1GB of files from an internal SSD drive to a USB-connected HDD. These simply copied to empty target directories.

cp -pur    : 19.5 seconds
rsync -ah  : 19.6 seconds
rsync -azh : 61.5 seconds

Both commands seem to be about the same, although zipping and unzipping obviously tax the system where bandwidth is not a bottleneck.



回答4:

Especially if you use a copy-on-write filesystem like BTRFS or ZFS, rsync is much better.

I use BTRFS, and I have this in my ~/.bashrc:

alias cp="rsync -ah --inplace --no-whole-file --info=progress2"

The important flag here for CoW FSs like BTRFS is --inplace because it only copies the changed part of the files, doesn't create new for small changes between files inodes, etc. See this.



回答5:

For a local copy, the only advantage of rsync is that it will avoid copying if the file already exists in the destination directory. The definition of "already exists" is (a) same file name (b) same size (c) same timestamp. (Maybe same owner/group; I am not sure...)

The "rsync algorithm" is great for incremental updates of a file over a slow network link, but it will not buy you much for a local copy, as it needs to read the existing (partial) file to run it's "diff" computation.

So if you are running this sort of command frequently, and the set of changed files is small relative to the total number of files, you should find that rsync is faster than cp. (Also rsync has a --delete option that you might find useful.)



回答6:

It's not really a question of what's more efficient.

The commands 'rsync', and 'cp' are not equivalent and achieve different goals.

1- rsync can preserve the time of creation of existing files. (using -a option)
2- rsync will run multiprocess and transfer using either local sockets or network sockets. (i.e. fork itself into multiple processes)
3- The multiprocessing, and threading will increase your throughput when copying large number of small files, and even with multiple larger files.

So bottom line is rsync is for large data, and cp is for smaller local copying. (MB to small GB range). When you start getting into multiple GB or in the TB range, go with rsync. And of course network copies, rsync all the way.



回答7:

Keep in mind that while transferring files internally on a machine i.e not network transfer, using the -z flag can have a massive difference in the time taken for the transfer.

Transfer within same machine

Case 1: With -z flag:
    TAR took: 9.48345208168
    Encryption took: 2.79352903366
    CP took = 5.07273387909
    Rsync took = 30.5113282204

Case 2: Without the -z flag:
    TAR took: 10.7535531521
    Encryption took: 3.0386879921
    CP took = 4.85565590858
    Rsync took = 4.94515299797


回答8:

if you are using cp doesn't save existing files when copying folders of the same name. Lets say you have this folders:

/myFolder
  someTextFile.txt

/someOtherFolder
  /myFolder
    wellHelloThere.txt

Then you copy one over the other:

cp /someOtherFolder/myFolder /myFolder

result:

/myFolder
  wellHelloThere.txt

This is at least what happens on macOS and I wanted to preserve the diff files so I used rsync.



标签: rsync cp