Thursday, April 17, 2008

iNodesBackup

I've been looking for a long time for Linux backup software that fits my needs. The closest I found is rdiff-backup, which is a great piece of software but has one big problem: it doesn't notice when files are moved or renamed, so it treats them as new files and wastes a lot of space.
Eventually I ended up writing my own script. It is not perfect, but it covers all my needs.

These are the operations iNodesBackup performs:

- a base directory (e.g. /home) is scanned for every sub-directory containing a file named ".backup"
- every file in these directories is hardlinked into an intermediate (but persistent) pool-directory and is named after its inode, so that "file_name == inode number"
- every file in these directories is listed with its attributes in an index file named with the current timestamp
- every inode-file in the pool-directory that must no longer be backed up (i.e. its original is now outside any directory containing a ".backup" file) is reported on standard output
- every inode-file in the pool-directory that has only one hard link left (i.e. the original has been deleted) is reported on standard output and moved to the "orphans" sub-directory of the pool-directory
- if the sub-command "rdiff-backup" is used, the pool-directory (excluding the "orphans" sub-dir) is backed up to another directory (usually on another volume) with rdiff-backup, which saves a lot of space by keeping only one full copy of each file plus the changes between versions
- if the sub-command "backup" is used, the rdiff-backup step is skipped; this may be useful to quickly test your command line, to save files before deleting them, or for any other purpose you can think of.
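
As a rough illustration of the first two steps (this is not the actual iNodesBackup code), the scan and the hardlink-by-inode pooling could look like the sketch below. The function name pool_by_inode and the example pool path are made up for this example. It also assumes that files in sub-directories of a marked directory are included, that the ".backup" marker itself is skipped, that directory names contain no newlines, and that GNU stat is available. The pool must live on the same file system as the base directory, because hard links cannot cross file systems.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the real iNodesBackup script.
# pool_by_inode BASE POOL
#   Hard-links every regular file that lives under a directory of BASE
#   containing a ".backup" marker into POOL, named after its inode number.
pool_by_inode() {
    base=$1 pool=$2
    mkdir -p "$pool"
    # Each ".backup" marker identifies a directory to back up.
    # Assumption: directory names contain no newline characters.
    find "$base" -xdev -type f -name .backup -exec dirname {} \; |
    while IFS= read -r dir; do
        find "$dir" -xdev -type f ! -name .backup -exec sh -c '
            pool=$1; shift
            for file in "$@"; do
                inode=$(stat -c %i "$file")   # GNU stat: print inode number
                # A moved or renamed file keeps its inode, so it is already
                # in the pool and is not linked a second time.
                [ -e "$pool/$inode" ] || ln "$file" "$pool/$inode"
            done' sh "$pool" {} +
    done
}

# Example call (paths are placeholders):
# pool_by_inode /home /home/.pool
```

Because the pool entry is keyed on the inode rather than on the path, a rename or move inside the backed-up tree leaves the pool unchanged, which is what keeps rdiff-backup from seeing a new file.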

I chose to only report the files that could be deleted, because I don't much like the idea of deleting dozens of files automatically on a server.
I prefer to examine every file in the "orphans" directory by hand and decide on the spot whether an inode-file should be deleted for good, so I added the sub-command "orphans" to perform this task.
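
The orphan detection described above relies on the hard link count: once the original file is deleted, the pool entry is the only link left. A minimal sketch of that check, again not taken from the real script (move_orphans is a made-up name, and -maxdepth assumes GNU find), might be:

```shell
#!/bin/sh
# Hypothetical sketch, NOT the real iNodesBackup script.
# move_orphans POOL
#   Reports every inode-file in POOL whose link count is 1 and moves it
#   to POOL/orphans for later manual review. Nothing is deleted.
move_orphans() {
    pool=$1
    mkdir -p "$pool/orphans"
    # -maxdepth 1 keeps find out of the orphans sub-directory itself;
    # -links 1 matches files whose only remaining name is the pool entry.
    find "$pool" -maxdepth 1 -type f -links 1 -exec sh -c '
        dest=$1; shift
        for file in "$@"; do
            echo "orphan: ${file##*/}"
            mv "$file" "$dest/"
        done' sh "$pool/orphans" {} +
}
```

Moving instead of deleting matches the choice explained above: the script only flags candidates, and a person decides what really goes away.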

So far I've never needed to restore anything from the backup, but if and when it happens, I will just need to grep the index files for the name of the file I need.
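
The index file format is not shown in this post, so the following is only a sketch under an assumed layout: one line per file, with the inode number and the path separated by a tab, and all index files kept in one directory. The function name find_in_indexes is hypothetical too.

```shell
#!/bin/sh
# Hypothetical sketch; the real index format may differ.
# find_in_indexes NAME INDEX_DIR
#   Prints every index line mentioning NAME, prefixed with the index
#   file (i.e. the timestamp) it was found in.
find_in_indexes() {
    name=$1 index_dir=$2
    # -F: treat NAME as a fixed string, -H: always print the file name,
    # -r: search every index file below INDEX_DIR.
    grep -rHF -- "$name" "$index_dir"
}

# Example call (paths are placeholders):
# find_in_indexes "report-final.txt" /backup/indexes
```

Under that assumed layout, the inode number on the matching line would be the name of the file to look for in the pool or in the rdiff-backup destination.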

Here is the link to my script:


To give you an idea of how to use it, this is the command line I use in my crontab:

mount /backup && { iNodesBackup rdiff-backup /home /backup/home ; df -h /backup ; umount /backup ; }

A note: to be able to back up directories on the root (/) file system, I had to use the "-xdev" option in the "find" command that searches for directories containing the ".backup" file. As a result, when you specify the base directory to scan and back up, keep in mind that iNodesBackup only considers the file system that directory lives on and will not descend into any other file system mounted below it.

I wrote this script on Ubuntu Gutsy (7.10), but it should work on any other modern Linux system.

If you find a bug, make improvements, or just want to say "Hey! This is what I was looking for!", please let me know by posting a comment here.

3 comments:

Anonymous said...

Hi,

This sounds just what I need! I have a number of offices with large amounts of data which doesn't change often, however sometimes people decide to shuffle folders around or rename them to archives. This can result in gigabytes being transferred when really there's no data change. I am considering a slightly more complex system as the rdiff-backup on the systems are different versions, and since rdiff-backup needs to complete fully, it might fail a few times and retransmit data, as the links between the offices are internet links and prone to failure. I'm thinking of using rsync to synchronise the files by inode, then maybe recreating a tree at the backup server end, then either rdiff-backup locally or a hardlink history using snapback or rsnapshot or something.

Thanks for posting this, I will give you more information if/when I get it going!

Josh.

Mico said...

Wow, someone finally found my almost-empty blog and found my script useful! Awesome! How in the world did you end up here? I'm really curious! :-)

I'm very happy my work may be useful to someone else... if you modify or improve iNodesBackup, it would be really nice if you shared your modifications.

Bye

Jozef Riha said...

hey, your script really looks useful - i do frequent file renames/moves also but what i dislike about your solution is an extra file in every directory. if this is stored in one place, it'd be great.