
Vincent's Blog

Pleasure in the job puts perfection in the work (Aristotle)

My simple Time Machine on top of NFS

Posted on 2016-07-24 13:58:00 from Vincent in NFS

In this post I'll explain what I did to get a Time Machine-like setup on my NFS server. As you can imagine, this solution relies mainly on hard links and uses rsync to perform the tasks. All of this is wrapped in a small shell script. It works well on my OpenBSD machine, and should also work on any Unix-like machine.


Download

You can get this small script on SourceForge.

Concept

With this small script, my goal was to have a NAS from which I can retrieve old versions of my files. To avoid using too much disk space, the solution relies on hard links.

In short, identical files are stored only once, but are referenced several times under different path names.
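To make the hard link idea concrete, here is a minimal sketch that is not taken from the script itself. It creates a file, gives it a second name with ln, and shows that both names point to the same inode. The file names are made up for the illustration.

```shell
#!/bin/sh
# Two hard-linked names share one inode, so the data is stored only once.
demo=$(mktemp -d)
echo "holiday picture" > "$demo/photo.jpg"
ln "$demo/photo.jpg" "$demo/photo_backup.jpg"   # second name, same data

# ls -i prints the inode number in front of the file name
inode_a=$(ls -i "$demo/photo.jpg" | awk '{print $1}')
inode_b=$(ls -i "$demo/photo_backup.jpg" | awk '{print $1}')
echo "inodes: $inode_a $inode_b"

# Removing one name leaves the data reachable through the other name
rm "$demo/photo.jpg"
content=$(cat "$demo/photo_backup.jpg")
echo "still readable: $content"

rm -rf "$demo"
```

The data is only freed once the last name pointing to it is removed, which is why deleting an old backup folder never damages the newer ones.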

More specifically, each backup creates a folder whose name is derived from the date. All files targeted by the backup are referenced inside this folder. In other words, by looking in the folder for a given date, you find all files as they were on that exact day. You can travel in time. This is nothing less than the concept of the "time machine".

For those backups, I'm using a specific feature of rsync that relies on hard links.
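The post does not name the rsync feature, so the following is only a sketch under the assumption that it is the --link-dest option, which hard-links unchanged files against an earlier copy. The function name snapshot and its arguments are hypothetical; the "previous" symlink mirrors the one visible in the directory listings further down.

```shell
#!/bin/sh
# snapshot ROOT NAME
# Copy ROOT/current into ROOT/NAME. Files unchanged since the backup that
# ROOT/previous points to become hard links instead of new copies.
# ROOT must be an absolute path: rsync resolves a relative --link-dest
# against the destination directory.
snapshot() {
    snap_root=$1     # folder that holds "current" and the dated backups
    snap_name=$2     # name of the new backup folder, e.g. 20160721
    if [ -L "$snap_root/previous" ]; then
        snap_prev=$(readlink "$snap_root/previous")
        rsync -a --link-dest="$snap_root/$snap_prev" \
            "$snap_root/current/" "$snap_root/$snap_name/" || return 1
    else
        # First run: nothing to link against, so this is a plain copy
        rsync -a "$snap_root/current/" "$snap_root/$snap_name/" || return 1
    fi
    # Remember the newest backup for the next run
    rm -f "$snap_root/previous"
    ln -s "$snap_name" "$snap_root/previous"
}
```

A call could look like `snapshot /net/nas/share/photo "$(date +%Y%m%d)"`. With -a, rsync only links a file when its size, modification time, permissions and ownership all match the earlier copy.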

The rest of the script is just "salt and pepper" to get something you can run from your crontab or interactively from the console. For example, it removes the oldest backups, that is, the backups older than the predefined retention value.

The script checks whether a backup is really needed. You can use 2 different methods to trigger a backup:

  • Size: a backup is triggered when the folder's size differs from its size at the previous backup. For example, adding or removing a file in the targeted folder triggers a backup, and so does any change that alters a file's size.
  • Full: a backup is triggered by a change of the folder's size or by a change of any file (mtime) since the last backup. With this option, a backup is triggered even if the change you made does not affect the file's size.

The first option is mainly useful for folders holding lots of binaries (photos, movies, ...).
The second option is mainly useful for folders holding text files (code, HTML, ...), where a change may leave the sizes untouched.
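The two triggers can be sketched as follows. This is an illustration, not the script's actual code: the function names, the way the folder size is measured (sum of file sizes in bytes) and the stamp file marking the previous backup are all assumptions.

```shell
#!/bin/sh
# folder_size DIR: total size in bytes of all regular files under DIR.
folder_size() {
    find "$1" -type f -exec ls -ln {} + |
        awk '{ sum += $5 } END { printf "%.0f\n", sum }'
}

# needs_backup DIR TYPE OLD_SIZE STAMP
# Succeeds (exit status 0) when a backup should be triggered.
#   TYPE      "check_only_size" or "full"
#   OLD_SIZE  folder size recorded at the previous backup
#   STAMP     file whose mtime marks the previous backup (hypothetical)
needs_backup() {
    nb_dir=$1
    nb_type=$2
    nb_old=$3
    nb_stamp=$4
    # Both modes trigger on a size change
    [ "$(folder_size "$nb_dir")" != "$nb_old" ] && return 0
    if [ "$nb_type" = full ]; then
        # "full" also triggers when any file was modified after the stamp
        nb_changed=$(find "$nb_dir" -type f -newer "$nb_stamp" | head -n 1)
        [ -n "$nb_changed" ] && return 0
    fi
    return 1
}
```

Replacing the content of a file by other content of the same length is the typical case that only the "full" mode notices.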

The design

On every run, the script checks for triggers. If one fires, a new folder is created and all current files are stored there. By default, the script is meant to run once per day.

The folder you want to back up must be named "current" and must come with a config file called ".time_machine" (in the examples below, it sits next to "current").

The config file has 1 mandatory parameter: backup_type.
This parameter accepts 2 values, full or check_only_size, matching the "Full" and "Size" triggers described above.

The next parameter is optional: historical_retention.

This parameter tells the script when backups are too old and can be removed. The default value is 10. Please note that the retention is counted in backups, not in days. So, if a backup is made every hour, the history covers at least 10 hours. If a backup is made every day, it covers at least 10 days. It can cover more, because when there is no reason to perform a backup, the script does not create a backup folder.
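A count-based retention like this one could be sketched as below. The function name prune_backups is hypothetical, and the sketch assumes backup folders named with 8 digits (YYYYMMDD), as in the examples of this post, so that alphabetical order is also date order.

```shell
#!/bin/sh
# prune_backups ROOT KEEP
# Keep the KEEP newest backup folders under ROOT and remove the older ones.
prune_backups() {
    pb_root=$1
    pb_keep=$2
    # List the 8-digit folders, oldest first
    pb_names=$(cd "$pb_root" && for d in [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]; do
        [ -d "$d" ] && echo "$d"
    done | sort)
    # Load the names into the positional parameters, so $# is their count
    set -- $pb_names
    pb_excess=$(($# - pb_keep))
    while [ "$pb_excess" -gt 0 ]; do
        rm -rf "$pb_root/$1"      # $1 is the oldest remaining backup
        shift
        pb_excess=$((pb_excess - 1))
    done
}
```

For example, `prune_backups /net/nas/share/photo 5` would leave the 5 most recent dated folders. "current" and the "previous" symlink are never touched because their names do not match the 8-digit pattern.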

The next parameter is optional: folder_pattern.
As mentioned above, each backup folder is named after a date. This parameter lets you choose how that name is built from date and time fields.
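Reading such a key=value file from a shell script can be done in a few lines. The helper below is a sketch, not the script's real parser. The name cfg_get is hypothetical, and it assumes one parameter per line with an optional trailing "# comment", as in the ".time_machine" files shown in the examples.

```shell
#!/bin/sh
# cfg_get FILE KEY DEFAULT
# Print the value of KEY from a key=value config file, or DEFAULT when
# the key is absent. Anything after a "#" is treated as a comment.
cfg_get() {
    cfg_file=$1
    cfg_key=$2
    cfg_default=$3
    cfg_value=$(sed -n "s/^${cfg_key}=\([^#]*\).*/\1/p" "$cfg_file" |
        sed 's/[[:space:]]*$//' | tail -n 1)
    printf '%s\n' "${cfg_value:-$cfg_default}"
}
```

With this helper, `cfg_get .time_machine historical_retention 10` prints the configured retention, or 10 when the parameter is missing.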

On each execution, the script creates at most one backup folder next to "current".

Some examples

Here is my photos folder:

obsd59:/net/nas/share/photo$ ls -al
total 144
drwxr-xr-x   8 vi    wheel   512 Jul 19 01:35 .
drwxr-xr-x   5 vi    wheel   512 Jul 19 17:32 ..
-rw-r--r--   1 root  wheel   108 Jul 19 01:35 .time_machine
drwxr-xr-x  20 vi    wheel  1024 Mar  2 22:24 20160601
drwxr-xr-x  20 vi    wheel  1024 Mar  2 22:24 20160602
drwxr-xr-x  20 vi    wheel  1024 Mar  2 22:24 20160605
drwxr-xr-x  20 vi    wheel  1024 Mar  2 22:24 20160614
drwxr-xr-x  20 vi    wheel  1024 Mar  2 22:24 20160719
drwxr-xr-x  20 vi    wheel  1024 Mar  2 22:24 current
obsd59:/net/nas/share/photo$ more .time_machine   
backup_type=check_only_size
historical_retention=5
folder_size=513454000 # calculated on 19-07-2016 01:35:08

Here is the folder holding my projects' source code:

obsd59:/net/nas/personal_files/vi/projects$ ls -al
total 528
drwxr-xr-x   28 vi    vi  1024 Jul 21 01:43 .
drwxrwx---   12 vi    vi   512 Feb 26 17:53 ..
-rw-r--r--    1 root  vi    98 Jul 21 01:43 .time_machine
drwxr-xr-x  102 vi    vi  3584 Jan  5  2016 20160115
drwxr-xr-x  102 vi    vi  3584 Jan  5  2016 20160116
drwxr-xr-x  102 vi    vi  3584 Jan  5  2016 20160117
drwxr-xr-x  102 vi    vi  3584 Jan  5  2016 20160202
drwxr-xr-x  102 vi    vi  3584 Jan  5  2016 20160215
drwxr-xr-x  102 vi    vi  3584 Jan  5  2016 20160229
drwxr-xr-x  102 vi    vi  3584 Jan  5  2016 20160404
drwxr-xr-x  102 vi    vi  3584 Jan  5  2016 20160505
drwxr-xr-x  104 vi    vi  3584 May 16 19:09 20160517
drwxr-xr-x  105 vi    vi  3584 May 18 14:34 20160519
drwxr-xr-x  105 vi    vi  3584 May 18 14:34 20160522
drwxr-xr-x  104 vi    vi  3584 Jun  1 08:47 20160602
drwxr-xr-x  104 vi    vi  3584 Jun  3 10:31 20160604
drwxr-xr-x  104 vi    vi  3584 Jun  3 10:31 20160605
drwxr-xr-x  104 vi    vi  3584 Jun  3 10:31 20160607
drwxr-xr-x  104 vi    vi  3584 Jun  3 10:31 20160611
drwxr-xr-x  104 vi    vi  3584 Jun  3 10:31 20160612
drwxr-xr-x  104 vi    vi  3584 Jun  3 10:31 20160614
drwxr-xr-x  104 vi    vi  3584 Jun  3 10:31 20160625
drwxr-xr-x  104 vi    vi  3584 Jun  3 10:31 20160629
drwxr-xr-x  104 vi    vi  3584 Jun  3 10:31 20160701
drwxr-xr-x  104 vi    vi  3584 Jun  3 10:31 20160717
drwxr-xr-x  104 vi    vi  3584 Jun  3 10:31 20160719
drwxr-xr-x  104 vi    vi  3584 Jun  3 10:31 20160720
drwxr-xr-x  105 vi    vi  3584 Jul 20 17:05 20160721
drwxr-xr-x  105 vi    vi  3584 Jul 20 17:05 current
lrwxr-xr-x    1 root  vi     8 Jul 21 01:41 previous -> 20160721
obsd59:/net/nas/personal_files/vi/projects$ more .time_machine 
backup_type=full
historical_retention=25
folder_size=13269296 # calculated on 21-07-2016 01:43:30

Extra note:

Because the current "view" of a targeted folder is in "current", you can hide the time machine principle from your end users by giving them soft links to it.

At home, each of my kids has, in the home directory of their own machine, a folder called "nas" linked to their own folder on the NAS. This folder is backed up by the Time Machine. This setup ensures that their files are backed up automatically every day.

~$ ls -al nas  
lrwxr-xr-x  1 ta  ta  44 Jul 18 13:36 nas -> /net/nas/personal_files/ta/documents/current

Moreover, the backup directories are mounted read-only, via NFS, in their home directory as "time machine". This way, they can easily recover an old version of a file without damaging the backups (by removing or renaming backed-up files).
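Recovering an old version then comes down to copying a file out of a dated folder. The helper below is only a sketch with hypothetical names. It removes the destination first and then copies, so the restored file is always a fresh copy and never a name shared with a backup.

```shell
#!/bin/sh
# restore_file SNAPSHOT RELPATH CURRENT
# Copy SNAPSHOT/RELPATH back into CURRENT/RELPATH.
restore_file() {
    rf_snapshot=$1   # a dated backup folder, e.g. ".../20160614"
    rf_path=$2       # path of the file, relative to that folder
    rf_current=$3    # the live "current" folder
    [ -f "$rf_snapshot/$rf_path" ] || return 1
    mkdir -p "$(dirname "$rf_current/$rf_path")" || return 1
    # Drop the existing name first, so cp creates a new file instead of
    # overwriting data that other names may still point to
    rm -f "$rf_current/$rf_path"
    cp -p "$rf_snapshot/$rf_path" "$rf_current/$rf_path"
}
```

A call could look like `restore_file "$HOME/time machine/20160614" essay.txt "$HOME/nas"`, where essay.txt is a made-up file name.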

Lessons learned

All in all, after 2 years (all backups have reached their maximum retention count), the extra space taken by the hard-linked backups is less than 9%. (I compared the size of the "current" folders with the size of the whole backups, using the du command.) This ratio could be very different, though, if I had removed several big files.
Thanks to this backup system, I have recovered some of my files after wrong manipulations.


