Replicate Filesystem with Rsync

Quick note on using rsync(1) to move data from one place to another.

For various reasons, I am moving my backup storage volume from one place to another. It is quite big and takes more than 24 hours to copy. Since I don’t want to stop the backups in the meantime, I need to be able to resynchronize the whole data set several times. So I decided to use rsync(1).

One special thing about my backup repository is that it is maintained by rsnapshot(1). This means the repo is full of hard links, as this is how rsnapshot(1) keeps a full set of files for each snapshot without duplicating the storage space.

The usual rsync -a is shorthand for -rlptgoD. This means that it will “copy symlinks as symlinks”. But it does not include -H (--hard-links), so every hard-linked name is transferred as a separate regular file. As a result, the destination copy takes a whoooole lot more space than the source. Because I keep quite a long history, the initial 2TB repository reached more than 10TB in the copy space…
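
To make the hard link point concrete, here is a small sketch that recreates the rsnapshot situation in a scratch directory: two snapshot directories holding the same file through a hard link, plus a plain copy for comparison. The names daily.0, daily.1, data.bin and plain_copy.bin are made up for illustration.

```shell
#!/bin/sh
# Scratch area; everything here is a throwaway illustration.
work=$(mktemp -d)
mkdir "$work/daily.0" "$work/daily.1"

# One real file, then a second name for the same data (a hard link).
echo "some backup payload" > "$work/daily.0/data.bin"
ln "$work/daily.0/data.bin" "$work/daily.1/data.bin"

# Both names report the same inode number: one copy on disk, two names.
inode_a=$(ls -i "$work/daily.0/data.bin" | awk '{print $1}')
inode_b=$(ls -i "$work/daily.1/data.bin" | awk '{print $1}')
echo "daily.0 inode:    $inode_a"
echo "daily.1 inode:    $inode_b"

# A plain copy gets a new inode and its own disk space. This is similar in
# effect to what happens to each hard-linked name when links are not preserved.
cp "$work/daily.1/data.bin" "$work/plain_copy.bin"
inode_c=$(ls -i "$work/plain_copy.bin" | awk '{print $1}')
echo "plain copy inode: $inode_c"

rm -rf "$work"
```

The first two inode numbers should be identical and the third one different. Multiply that by every unchanged file in every snapshot and the extra space adds up quickly.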

With the --hard-links option, rsync(1) preserves hard links, and I got a perfect copy of the initial repository. Note to self, use:

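# rsync -avP --hard-links --delete /backup_orig/ /backup_new/

The -v and -P options produce a bunch of information to monitor what happens during the copies (initial and retries). The --delete option makes sure the destination is an exact copy of the source, deleting from the destination any file that is no longer present on the source. This is needed because the source keeps being modified by the regular backup processes while the rsync process to the destination is still running.

Note the trailing slash on the source rather than /backup_orig/*. A wildcard is expanded by the shell, so rsync only sees a list of individual entries: top-level dotfiles are skipped, and --delete does not remove top-level entries that have disappeared from the source.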
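
To check that a copy really preserved the hard links, one option is to compare the number of file names with the number of distinct inodes on each side. The following is a minimal sketch, not something rsync provides: the helper names count_names and count_inodes are made up for this note, and it assumes each tree sits on a single filesystem and that no file name contains a newline.

```shell
#!/bin/sh
# count_names DIR: number of regular-file names found under DIR.
count_names() {
    find "$1" -type f | wc -l | tr -d ' '
}

# count_inodes DIR: number of distinct inodes behind those names.
# Hard-linked names share an inode, so they are counted only once here.
count_inodes() {
    find "$1" -type f -exec ls -i {} + | awk '{print $1}' | sort -u | wc -l | tr -d ' '
}
```

Running both helpers on /backup_orig and on /backup_new once the backups are quiet should give the same pair of numbers. If the destination shows as many inodes as names while the source shows far fewer, the hard links were not preserved.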