In this post I hope to help you understand how to use Rdiff-backup. I stumbled across Rdiff-backup several years ago, and it has helped me streamline and simplify most of my file-based backup processes on Linux servers. As the name implies, a diff is performed on the files being backed up, so after the first run only the differences get transferred and stored. As you can probably guess, storing only the diffs of changes can lead to a much smaller backup.
One advantage of Rdiff-backup is that it uses the same delta algorithm as rsync (through librsync), so it can quickly and easily mirror a directory. Backups can finish in seconds if there have been few changes. When backing up to a remote machine, your data travels over a secure ssh connection, so your files are protected in transit. And because it is a simple command line tool, you can easily script a backup scenario, or just use it straight from your terminal.
How I Use Rdiff-backup
I typically use Rdiff-backup with my web hosting clients, where it allows for quick backup and restore of their web files. And because most of the files don’t change from day to day, the backup is lightning fast.
rdiff-backup --exclude '**cache/' \
  --exclude '**debug.log' \
  /var/www \
  user@ipAddress::/home/user/backup
rdiff-backup --remove-older-than 52W user@ipAddress::/home/user/backup
Let’s break down the commands. The first Rdiff-backup command has two “--exclude” options which, as the name indicates, exclude the referenced files or directories from the backup. An exclude option can either spell out a directory path in full or, as in this case, use “**”, which matches any path. A single asterisk “*” can also be used, but it only matches within one part of a path, since it never matches a “/”.
The “/var/www” part of the command is the directory to back up. The destination uses standard ssh authentication, with “user@ipAddress” (e.g. “bob@10.1.1.1”) as the login. That is followed by a double colon, which differs from the single colon used by rsync and scp. The “::” precedes the backup destination directory.
So, in basic terms, the first command will back up everything under the “/var/www” directory, except any directory whose path ends in “cache/” and any file whose path ends in “debug.log”. The backup will be written to the “/home/user/backup” directory on the remote machine.
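If you want to script the backup rather than type it each time, a minimal sketch could look like the following. The function name run_backup and the DRY_RUN switch are my own inventions for illustration, not part of Rdiff-backup, and the login and paths are the same placeholders used above. By default the script only prints the command it would run, so you can check it before pointing it at a real server.

```shell
#!/bin/sh
# Sketch only: wraps the backup command from this post in a function.
# DRY_RUN (hypothetical switch) defaults to 1, so the command is
# printed rather than executed. Set DRY_RUN=0 to really run it.
set -eu

SOURCE_DIR="/var/www"                      # directory to back up
DEST="user@ipAddress::/home/user/backup"   # placeholder login and path

run_backup() {
  # Build the argument list in the positional parameters so each
  # exclude pattern stays a single argument the shell does not expand.
  set -- rdiff-backup \
    --exclude '**cache/' \
    --exclude '**debug.log' \
    "$SOURCE_DIR" "$DEST"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"   # print the command instead of running it
  else
    "$@"                 # run the real backup
  fi
}

run_backup
```

Note that the printed command shows the patterns without their quotes, so copy the quoted form from the script if you want to paste it into a terminal.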
Backup Lifecycle
Now that you have a backup created, how do you manage how long it will be kept? Without a command like the second one above, the increments Rdiff-backup stores will be kept indefinitely. Adding the second command lets us manage how long to keep the backups.
rdiff-backup --remove-older-than 52W user@ipAddress::/home/user/backup
This command is a little simpler than the first. It doesn’t specify a source directory, only how long to keep files. The “--remove-older-than” option accepts several kinds of time values. I like to keep my backups for a year, so “52W” gives me 52 weeks of backups. Any existing diffs older than the specified time are removed. The available units are s, m, h, D, W, M, or Y (indicating seconds, minutes, hours, days, weeks, months, or years respectively). Additionally a “B” can be used to indicate a number of backups, e.g. “3B” would keep the last three backups.
After specifying the number or timeframe of backups to keep, the only other thing to specify is the backup location. The trimming of backup content is then performed on the given location.
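To tie the backup and the cleanup together, for example in a nightly cron job, here is a rough sketch. The helper names (valid_keep, run, backup_and_prune) and the DRY_RUN switch are hypothetical, and the check on the retention value is deliberately simplified: it only accepts a number followed by a single unit letter, while Rdiff-backup itself understands more forms than that. Depending on your version, removing several increments in one go may also need an extra confirmation flag, so check the documentation before relying on it.

```shell
#!/bin/sh
# Sketch only: back up, then trim old increments, in one script.
# DRY_RUN (hypothetical switch) defaults to 1: commands are printed,
# not executed. Set DRY_RUN=0 to really run them.
set -eu

SOURCE_DIR="/var/www"
DEST="user@ipAddress::/home/user/backup"   # placeholder login and path
KEEP="52W"                                 # retention, e.g. 30D, 52W, 3B

valid_keep() {
  # Simplified check: digits followed by one of s m h D W M Y B.
  printf '%s\n' "$1" | grep -Eq '^[0-9]+[smhDWMYB]$'
}

run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"   # show the command
  else
    "$@"                 # execute the command
  fi
}

backup_and_prune() {
  # Refuse to continue if the retention value looks wrong.
  valid_keep "$KEEP" || { echo "bad retention value: $KEEP" >&2; return 1; }
  run rdiff-backup --exclude '**cache/' --exclude '**debug.log' \
    "$SOURCE_DIR" "$DEST"
  run rdiff-backup --remove-older-than "$KEEP" "$DEST"
}

backup_and_prune
```

Running the backup first and the cleanup second means a failed backup stops the script (because of set -e) before anything old is removed.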
Performing a Restore
So now you have your content backed up, and you need to restore something. A backup is only as good as the data you can restore out of it, right? Fortunately the restore process is fairly simple as well.
rdiff-backup -r 5D user@ipAddress::/home/user/backup/example.com/index.php /home/localUser/www/restore/
The “-r” option tells Rdiff-backup to restore files, and uses the same time format as the delete option above. In this case we are restoring a version of the file “example.com/index.php” from 5 days ago. The file is being restored/copied to “/home/localUser/www/restore/”. The same can be done for an entire directory structure.
rdiff-backup -r 3B user@ipAddress::/home/user/backup/example.com/ /home/localUser/www/restore/
This command restores/copies all the contents of the example.com directory to “/home/localUser/www/restore/” as they were three backups ago. Or, if you are hunting for a specific day, you can always do something like this.
rdiff-backup -r 03-05-2020 user@ipAddress::/home/user/backup/example.com/ /home/localUser/www/restore/
That will perform the same restore, but specifically as of the 5th of March 2020, since the date is read as month-day-year. It can also be written as “03/05/2020” or “2020-03-05”, and all of these indicate midnight at the start of that day.
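Before restoring, it helps to see which backups actually exist. The sketch below assumes the classic command-line syntax used throughout this post, where “--list-increments” prints the available increments; newer releases also offer a different command style, so treat this as a starting point. The function names and the DRY_RUN switch are again made up for illustration.

```shell
#!/bin/sh
# Sketch only: list the available increments, then restore by date.
# DRY_RUN (hypothetical switch) defaults to 1: commands are printed,
# not executed. Set DRY_RUN=0 to really run them.
set -eu

REPO="user@ipAddress::/home/user/backup"   # placeholder login and path

run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"   # show the command
  else
    "$@"                 # execute the command
  fi
}

list_increments() {
  # Shows which backup times can be restored from.
  run rdiff-backup --list-increments "$REPO"
}

restore_as_of() {
  # $1 = time value (e.g. 5D, 3B, 2020-03-05)
  # $2 = path inside the backup
  # $3 = local restore target
  run rdiff-backup -r "$1" "$REPO/$2" "$3"
}

list_increments
restore_as_of 2020-03-05 example.com/ /home/localUser/www/restore/
```

I use the year-first form of the date here because it cannot be misread as day-month-year.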
For a full rundown on all the options and other details for Rdiff-backup check out the project documentation.
There you have a basic rundown on how to use Rdiff-backup. I find it a very useful and powerful tool, and hope it will help you keep your backups running.