Rsbackup - Introduction (Read me first)


rsbackup is a client/server backup system. It is built as a set of Perl wrappers around rsync (https://rsync.samba.org/). The client and server may be installed on any system supporting rsync and a minimal number of Perl modules.

rsbackup has these main design considerations:

  • The server never has access to the client machines; all communication is initiated by the client
  • The server has a means to detect invalid connection attempts and block them (with reporting)
  • The client reports success/failure and some basic statistics on each backup job
  • The server stores multiple incremental backups in a highly efficient manner. This is done via hard links, so it is not available on file systems that do not support them.
  • Server security
  • Recovery in case of failures during backup

Overview of Operation

In each step, you will see "Server validates connection." Depending on configuration, the server will validate any combination of:

  • source IP address
  • date/time of initiation
  • authentication tokens (e.g., public key, password)
  • command given by the client (e.g., "Prepare", "Backup", "Cleanup")

  1. (Optional) Client sends "Prepare" command to server
    1. Server validates connection
    2. Server executes whatever Prepare commands are configured for the client. An example is creating a copy of the existing backup in a date/time-stamped directory tree. Note: arbitrary commands are not allowed. The commands executed by Prepare must be configured on the server side.
    3. Server replies to client with output of Prepare command(s)
  2. Client sends backup request to server
    1. Server validates connection
    2. Server validates backup command
    3. rsync session initiated
  3. (Optional) Client sends "Cleanup" command to server
    1. Server validates connection
    2. Server executes whatever Cleanup commands are configured for the client. Examples are removing old trees created by a versioning system, or calculating and reporting the amount of disk space used/available.
    3. Server replies to client with output of Cleanup command(s)
  4. (Optional) Client sends a report of the backup session (including output of Prepare and Cleanup) to a monitoring e-mail address.
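
The "Server validates connection" check that appears in each step can be sketched roughly as follows. This is an illustration in Python, not rsbackup's actual Perl code: the ClientPolicy structure, its field names, and validate_connection are hypothetical, authentication tokens are left out, and time windows that cross midnight are not handled.

```python
from dataclasses import dataclass, field
from datetime import time
from ipaddress import ip_address, ip_network


# Hypothetical per-client policy; these field names are illustrative only
# and are not taken from rsbackup's configuration format.
@dataclass
class ClientPolicy:
    allowed_networks: list          # e.g. ["192.0.2.0/24"]
    window_start: time              # earliest time a job may start
    window_end: time                # latest time a job may start
    allowed_commands: set = field(
        default_factory=lambda: {"Prepare", "Backup", "Cleanup"})


def validate_connection(policy, source_ip, started_at, command):
    """Return (ok, reason). Every configured check must pass."""
    addr = ip_address(source_ip)
    if not any(addr in ip_network(net) for net in policy.allowed_networks):
        return False, "source address not allowed"
    if not (policy.window_start <= started_at <= policy.window_end):
        return False, "outside permitted time window"
    if command not in policy.allowed_commands:
        return False, "command not allowed"
    return True, "ok"


# Example addresses come from the documentation ranges, not a real setup.
policy = ClientPolicy(["192.0.2.0/24"], time(1, 0), time(5, 0))
print(validate_connection(policy, "192.0.2.17", time(2, 30), "Backup"))
print(validate_connection(policy, "198.51.100.9", time(2, 30), "Backup"))
```

Returning a reason along with the result gives the server something to put in its report when it blocks an invalid attempt.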

 

rsbackup is available as Debian packages from the Daily Data server. To use this, create the file /etc/apt/sources.list.d/dailydata.list with the following contents:

#
# Daily Data Repository
#
deb http://debian.dailydata.net/debian_repository /

Then update your list of available packages with apt-get update. After this, you can find the rsbackup packages using aptitude.

Note: the dailydata repository does not have an authentication key at this time. At some point I will do the research to get a signing key set up for the repository.

Hints

Rename = delete and add

When a file or directory is renamed, rsync considers it to be a deletion of the old object and the creation of a new one. Not a big thing if it is one small file, but renaming StarTrek.m4v to StarTrekOriginal.m4v means the file StarTrek.m4v will be deleted from the server (it is no longer on the client) and the new file StarTrekOriginal.m4v will be backed up in full.

This is bad enough when you do it to a 1 GB movie file, but imagine that file (and a few hundred others like it) is stored in the directory "scifi" and we decide to rename the directory to "Science Fiction". Every file under it would be deleted from the server and copied again.

The only solution I have come up with to avoid this would be to actually store files based on their checksum, then look for the checksum on the target server. In this case, we would maintain the directory structure of the client machine in a list, probably a database, pointing to a balanced binary tree of files. It would be extremely efficient, but I just don't have time right now to consider it.
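
To make the checksum idea concrete, here is a small sketch in Python. It is not part of rsbackup. The functions file_digest and store are hypothetical, a plain dictionary stands in for the database or tree mentioned above, and everything runs locally in a temporary directory rather than between a client and a server.

```python
import hashlib
import os
import shutil
import tempfile


def file_digest(path):
    """SHA-256 of a file's contents, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def store(path, client_name, blob_dir, index):
    """Copy the file into blob_dir only if its content is new.

    index maps a client-side path to a content digest.
    Returns True if data had to be copied.
    """
    digest = file_digest(path)
    blob = os.path.join(blob_dir, digest)
    copied = not os.path.exists(blob)
    if copied:
        shutil.copyfile(path, blob)
    index[client_name] = digest
    return copied


# Demonstration: a rename changes the index, not the stored data.
work = tempfile.mkdtemp()
blobs = os.path.join(work, "blobs")
os.mkdir(blobs)
src = os.path.join(work, "movie.bin")
with open(src, "wb") as f:
    f.write(b"x" * 4096)

index = {}
first = store(src, "scifi/StarTrek.m4v", blobs, index)
del index["scifi/StarTrek.m4v"]            # the old name disappears
second = store(src, "Science Fiction/StarTrek.m4v", blobs, index)
print(first, second, len(os.listdir(blobs)))
```

The second call finds the content already stored under its digest, so only the path-to-digest mapping changes and no file data is copied.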

Seed Backups

rsync is very efficient when it comes to finding deltas (differences between two file systems). It not only limits copying to the files that have changed; within each changed file it also tries to determine which parts changed and transfers only those.

Backing up a system containing a lot of data for the first time, however, requires that all files be copied. In this case, I generally create a "seed backup" on a removable drive, then hand-carry that to the server and copy the files to where they are supposed to end up.

Then, and this is the important part, I put the client in "dry run" mode and do a backup. Dry Run only tells you what it wants to do; it does not actually do the copies. You can then look at the log created and make sure you put the files in the correct location. Once you have tested, you can turn off "dry run" mode and let your backups proceed normally.
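
For reference, this is roughly what a dry run looks like at the rsync level. The sketch below builds a plain rsync command line in Python using rsync's own --dry-run option. How rsbackup itself switches dry run mode on is not shown here, and the helper name, host, and paths are hypothetical.

```python
import shlex


def rsync_command(source, dest, dry_run=True):
    """Build an rsync argument list; with dry_run, nothing is transferred."""
    cmd = ["rsync", "--archive", "--verbose", "--itemize-changes"]
    if dry_run:
        cmd.append("--dry-run")
    cmd += [source, dest]
    return cmd


# Hypothetical paths, for illustration only.
cmd = rsync_command("/home/", "backup.example.com:/backups/client1/home/")
print(shlex.join(cmd))
# To execute it: subprocess.run(cmd, check=True)
```

After a correctly placed seed, the itemized list should be short. If nearly every file shows up as needing a transfer, the seed most likely landed in the wrong directory.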

Versions must be on same file system

If you use the version add-in, be aware that it uses cp -al, which creates hard links instead of copying data. A hard link is a second directory entry for the same file, so it can only exist on the same file system as the file it points to. You cannot put your versions on a different partition from your backups.
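
A quick way to check this before choosing a versions directory is to compare device numbers. The sketch below is an illustration in Python, not part of rsbackup, and the helper name same_filesystem is hypothetical. It runs in a temporary directory so it is safe to try. Matching device numbers are a good indication but not a guarantee on every setup.

```python
import os
import tempfile


def same_filesystem(path_a, path_b):
    """True if both existing paths report the same device number."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


# Demonstration in a temporary directory.
work = tempfile.mkdtemp()
backup = os.path.join(work, "backup.txt")
version = os.path.join(work, "version.txt")
with open(backup, "w") as f:
    f.write("data")

if same_filesystem(work, os.path.dirname(version)):
    os.link(backup, version)          # hard link: no data is copied
    print(os.path.samefile(backup, version), os.stat(backup).st_nlink)
```

When the two paths are on different file systems, os.link raises OSError, which is the same limitation cp -al runs into.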

Last update: 2014-10-29 08:01
Author: Rod
Revision: 1.3