rsnapshot HOWTO

2004-01-20

Revision History
Revision 0.9.72005-01-17NR
Spelling corrections submitted by Nicolas Kaiser
Revision 0.9.62004-12-13NR
Misc. updates
Revision 0.9.52004-07-10NR
Relicensed document under GPL, instead of FDL
Revision 0.9.42004-07-02NR
Added description of proper crontab time settings
Revision 0.9.32004-06-11NR
Misc. updates
Revision 0.9.22004-05-16NR
Updated --link-dest info
Revision 0.9.12004-01-20NR
Added --link-dest info
Revision 0.92004-01-10NR
First draft

Abstract

rsnapshot is a filesystem backup utility based on rsync. Using rsnapshot, it is possible to take snapshots of your filesystems at different points in time. Using hard links, rsnapshot creates the illusion of multiple full backups, while only taking up the space of one full backup plus differences. When coupled with ssh, it is possible to take snapshots of remote filesystems as well. This document is a tutorial in the installation and configuration of rsnapshot.


rsnapshot is a filesystem backup utility based on rsync. Using rsnapshot, it is possible to take snapshots of your filesystems at different points in time. Using hard links, rsnapshot creates the illusion of multiple full backups, while only taking up the space of one full backup plus differences. When coupled with ssh, it is possible to take snapshots of remote filesystems as well.

rsnapshot is written in Perl, and depends on rsync. OpenSSH, GNU cp, and the BSD logger program are also recommended, but not required. All of these should be present on most Linux systems. rsnapshot is written with the lowest common denominator in mind. It only requires at minimum Perl 5.004 and rsync. As a result of this, it works on pretty much any UNIX-like system you care to throw at it. It has been successfully tested with Perl 5.004 through 5.8.2, on Debian, Redhat, Fedora, Solaris, Mac OS X, FreeBSD, OpenBSD, and IRIX.

The latest version of the program and this document can always be found at http://www.rsnapshot.org/.

This section will walk you through the installation of rsnapshot, step by step. This is not the only way to do it, but it is a way that works and that is well documented. Feel free to improvise if you know what you're doing.

In this example, we will be using the /.snapshots/ directory to hold the filesystem snapshots. This is referred to as the “snapshot root”. Feel free to put this anywhere you have lots of free disk space. However, the examples in this document assume you have not changed this parameter, so you will have to substitute this in your commands if you put it somewhere else.

Also please note that fields are separated by tabs, not spaces. The reason for this is so it's easier to specify file paths with spaces in them.

This is the section where you tell rsnapshot what files you actually want to back up. You put a “backup” parameter first, followed by the full path to the directory or network path you're backing up. The third column is the relative path you want to back up to inside the snapshot root. Let's look at an example:

backup      /etc/      localhost/etc/

In this example, backup tells us it's a backup point. /etc/ is the full path to the directory we want to take snapshots of, and localhost/etc/ is a directory inside the snapshot_root we're going to put them in. Using the word localhost as the destination directory is just a convention. You could just as easily use etc/ all by itself, or the server's proper domain name instead of localhost. If you are taking snapshots of several machines on one dedicated backup server, it's a good idea to use their various hostnames as directories to keep track of which files came from which server.

In addition to full paths on the local filesystem, you can also backup remote systems using rsync over ssh. If you have ssh installed and enabled (via the cmd_ssh parameter), you can specify a path like:

backup      root@example.com:/etc/     example.com/etc/

This behaves fundamentally the same way, but you must take a few extra things into account.

With this parameter, the second column is the full path to an executable backup script, and the third column is the local path you want to store it in (just like with the "backup" parameter). For example:

backup_script      /usr/local/bin/backup_pgsql.sh       localhost/postgres/

In this example, rsnapshot will run the script /usr/local/bin/backup_pgsql.sh in a temp directory, then sync the results into the localhost/postgres/ directory under the snapshot root. You can find the backup_pgsql.sh example script in the utils/ directory of the source distribution. Feel free to modify it for your system.

Your backup script simply needs to dump out the contents of whatever it does into it's current working directory. It can create as many files and/or directories as necessary, but it should not put its files in any pre-determined path. The reason for this is that rsnapshot creates a temp directory, changes to that directory, runs the backup script, and then syncs the contents of the temp directory to the local path you specified in the third column. A typical backup script would be one that archives the contents of a database. It might look like this:

#!/bin/sh

/usr/bin/mysqldump -uroot mydatabase > mydatabase.sql
/bin/chmod 644 mydatabase.sql

There are several example scripts in the utils/ directory of the rsnapshot source distribution to give you more ideas.

[Warning]Warning
Please remember that these backup scripts will be invoked as the user running rsnapshot. In our example, this is root. Make sure your backup scripts are owned by root, and not writable by anyone else. If you fail to do this, anyone with write access to these backup scripts will be able to put commands in them that will be run as the root user. If they are malicious, they could take over your server.

We have a snapshot root under which all backups are stored. By default, this is the directory /.snapshots/. Within this directory, other directories are created for the various intervals that have been defined. In the beginning it will be empty, but once rsnapshot has been running for a week, it should look something like this:

[root@localhost]# ls -l /.snapshots/
drwxr-xr-x    7 root     root         4096 Dec 28 00:00 daily.0
drwxr-xr-x    7 root     root         4096 Dec 27 00:00 daily.1
drwxr-xr-x    7 root     root         4096 Dec 26 00:00 daily.2
drwxr-xr-x    7 root     root         4096 Dec 25 00:00 daily.3
drwxr-xr-x    7 root     root         4096 Dec 24 00:00 daily.4
drwxr-xr-x    7 root     root         4096 Dec 23 00:00 daily.5
drwxr-xr-x    7 root     root         4096 Dec 22 00:00 daily.6
drwxr-xr-x    7 root     root         4096 Dec 29 00:00 hourly.0
drwxr-xr-x    7 root     root         4096 Dec 28 20:00 hourly.1
drwxr-xr-x    7 root     root         4096 Dec 28 16:00 hourly.2
drwxr-xr-x    7 root     root         4096 Dec 28 12:00 hourly.3
drwxr-xr-x    7 root     root         4096 Dec 28 08:00 hourly.4
drwxr-xr-x    7 root     root         4096 Dec 28 04:00 hourly.5

Inside each of these directories is a “full” backup of that point in time. The destination directory paths you specified under the backup and backup_script parameters get stuck directly under these directories. In the example:

backup          /etc/           localhost/etc/

The /etc/ directory will initially get backed up into /.snapshots/hourly.0/localhost/etc/

Each subsequent time rsnapshot is run with the hourly command, it will rotate the hourly.X directories, and then “copy” the contents of the hourly.0 directory (using hard links) into hourly.1.

When rsnapshot daily is run, it will rotate all the daily.X directories, then copy the contents of hourly.5 into daily.0.

hourly.0 will always contain the most recent snapshot, and daily.6 will always contain a snapshot from a week ago. Unless the files change between snapshots, the “full” backups are really just multiple hard links to the same files. Thus, if your /etc/passwd file doesn't change in a week, hourly.0/localhost/etc/passwd and daily.6/localhost/etc/passwd will literally be the same exact file. This is how rsnapshot can be so efficient on space. If the file changes at any point, the next backup will unlink the hard link in hourly.0, and replace it with a brand new file. This will now take double the disk space it did before, but it is still considerably less than it would be to have full unique copies of this file 13 times over.

Remember that if you are using different intervals than the ones in this example, the first interval listed is the one that gets updates directly from the main filesystem. All subsequently listed intervals pull from the previous intervals. For example, if you had weekly, monthly, and yearly intervals defined (in that order), the weekly ones would get updated directly from the filesystem, the monthly ones would get updated from weekly, and the yearly ones would get updated from monthly.

When rsnapshot is first run, it will create the snapshot_root directory (/.snapshots/ by default). It assigns this directory the permissions 700, and for good reason. The snapshot root will probably contain files owned by all sorts of users on your system. If any of these files are writeable (of course some of them will be), the users will still have write access to their files. Thus, if they can see the snapshots directly, they can modify them, and the integrity of the snapshots can not be guaranteed. Why even bother with backups when anyone can change them?

If users need to be able to pull their own backups, you will need to do a little extra work up front (but probably less work in the long run). The best way to do this seems to be creating a container directory for the snapshot root with 700 permissions, giving the snapshot root directory 755 permissions, and mounting the snapshot root for the users read-only. This can be done over NFS and Samba, to name two possibilities. Let's explore how to do this using NFS on a single machine:

Set the snapshot_root variable in /etc/rsnapshot.conf equal to /.private/.snapshots/

snapshot_root       /.private/.snapshots/

Create the container directory:

mkdir /.private/

Create the real snapshot root:

mkdir /.private/.snapshots/

Create the read-only snapshot root mount point:

mkdir /.snapshots/

Set the proper permissions on these new directories:

chmod 0700 /.private/
chmod 0755 /.private/.snapshots/
chmod 0755 /.snapshots/

In /etc/exports, add /.private/.snapshots/ as a read only NFS export:

/.private/.snapshots/  127.0.0.1(ro,no_root_squash)

In /etc/fstab, mount /.private/.snapshots/ read-only under /.snapshots/

localhost:/.private/.snapshots/   /.snapshots/   nfs    ro   0 0

You should now restart your NFS daemon.

Now mount the read-only snapshot root:

mount /.snapshots/

To test this, go into the /.snapshots/ directory as root. It is set to read-only, so even root shouldn't be able to write to it. As root, try:

touch /.snapshots/testfile

This should fail, citing insufficient permissions. This is what you want. It means that your users won't be able to mess with the snapshots either.

Now, all your users have to do to recover old files is go into the /.snapshots directory, select the interval they want, and browse through the filesystem until they find the files they are looking for. They can't modify anything in here because NFS will prevent them, but they can copy anything that they had read permission for in the first place. All the regular filesystem permissions are still at work, but the read-only NFS mount prevents any writes from happening.

Please note that some NFS configurations may prevent you from accessing files that are owned by root and set to only be readable by root. In this situation, you may wish to pull backups for root from the "real" snapshot root, and let non-privileged users pull from the read-only NFS mount.