Saturday, March 01, 2014

Why I am spending a good part of my weekend struggling with a backup scheme.

I continually struggle with the best means of keeping our personal digital information safely backed up. I recently upgraded my wife's photography PC to a 4 TB hard disk and it is taking several days to transfer all the files and set up an appropriate backup regime. How do other people with less patience than I have manage this?

First a confession: I do not back up my own files any more. I have handed over responsibility for doing so to Google, Microsoft and Dropbox. However, my wife is a keen amateur photographer who generates hundreds of gigabytes' worth of images every year, and cloud storage would be prohibitively expensive for this amount of data.

At first glance the problem and solution are fairly straightforward because there is actually only one practical and economically feasible method of backing up a multi-terabyte hard disk: copy the files to a USB-connected external hard disk. Unfortunately even this apparently simple approach turns out to be more complex than it first appears:

First off there is the question of what software to use to automate the backup process. I have no doubt that most operating systems offer built-in backup facilities but I have been using the excellent Karen's Replicator for many years. The wonderful Karen Kenworthy died at a tragically early age back in 2011 but her useful Windows tools are still available from here: and her Replicator still works flawlessly on Windows 8.1. The beauty of Replicator is its simplicity. You can set it to run incremental backups from one location to another and it recreates file and directory structures exactly. It doesn't compress and it doesn't encode, so the backup is readable by just about anything, which ensures maximum longevity. Replicator's file-by-file copy is not fast but it doesn't have to be because it runs once per day (you can set the schedule) in the background and only copies files that have changed.
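As a rough illustration of what an incremental, structure-preserving backup does, here is a minimal Python sketch. It is not how Replicator is implemented; the function names `needs_copy` and `mirror` are made up for this example, and the rule "copy if the size differs or the source is newer" is an assumption about what counts as a changed file.

```python
import shutil
from pathlib import Path


def needs_copy(src: Path, dst: Path) -> bool:
    """Return True if dst is missing, a different size, or older than src."""
    if not dst.exists():
        return True
    s, d = src.stat(), dst.stat()
    # Allow 2 seconds of slack: some file systems store coarse timestamps.
    return s.st_size != d.st_size or s.st_mtime - d.st_mtime > 2


def mirror(source: Path, destination: Path) -> list[Path]:
    """Copy new or changed files, recreating the folder layout as plain files."""
    copied = []
    for src in source.rglob("*"):
        if not src.is_file():
            continue
        dst = destination / src.relative_to(source)
        if needs_copy(src, dst):
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also carries over the modification time
            copied.append(dst)
    return copied
```

Running `mirror` a second time on an unchanged folder copies nothing, which is the whole point of an incremental backup. This sketch skips empty folders and never deletes anything from the destination.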

The next unexpected issue is that some USB-connected external disks go to sleep when the computer is not in use and do not wake up again automatically. Karen's Replicator cannot do a backup if the destination drive is offline. I have had this problem with two different external disk drives and it is most frustrating. My current stopgap fix is a timer on the electrical socket powering the external disk drive, set to cycle the power off and on just before a scheduled backup.
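A software check can at least detect an offline destination before a backup starts. The sketch below is a hypothetical helper (the name `wait_for_drive` and the marker file `.backup_probe` are invented here) that tries to write a tiny file to the drive and retries a few times. Whether a write like this actually wakes a sleeping disk depends on the enclosure, so treat it as a probe rather than a cure.

```python
import time
from pathlib import Path


def wait_for_drive(root: Path, attempts: int = 5, delay: float = 30.0) -> bool:
    """Return True once a small test file can be written under root."""
    probe = root / ".backup_probe"  # hypothetical marker file name
    for attempt in range(attempts):
        try:
            probe.write_text("awake")
            probe.unlink()
            return True
        except OSError:
            # Drive missing or still spinning up: pause, then try again.
            if attempt < attempts - 1:
                time.sleep(delay)
    return False
```

A scheduled job could call something like `wait_for_drive(Path("E:/"))` first (the drive letter is only an example) and skip or report the backup if it returns False.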

The next issue is what do you do when a disk drive fills up? 1 TB sounded like an enormous amount of space a few years ago but my wife can generate 100 GB of image files in less than a month. Thankfully disk drives are not that expensive but you need to decide whether the new drive will be internal or external and whether it will be in addition to any existing drives or will replace them. We have tried just about every combination of additional storage over the years and I have come to the conclusion that the more individual disk drives you have connected to a PC (we had six at one stage) the more problems you will have with them. The best solution, if you can manage it, is to replace everything with a single new drive that is big enough to hold all your old files and give you additional space. This is also the best way to ensure data longevity (see below). Of course any new disks you add will need to be backed up, so remember you will always need to buy two of everything.

At some stage in this process you are going to have to copy terabytes of files from one location to another all in one go. Set aside a couple of days for this. Seriously. You might think you can just do a Windows copy from one disk to another and let it run overnight. You would be wrong. In the first instance Windows copy doesn't verify after it copies so I recommend using a tool like FastCopy instead. FastCopy is also a good deal quicker than Windows copy, which is a bonus. In the second instance your massive copy is likely to crash some time in the middle of the night with file permission errors. Then you need to navigate Windows' completely non-intuitive file security interface to actually get the file permissions you need to do the copy. Having administrator privileges isn't enough. If Windows thinks you aren't the owner of a file created ten years ago on a computer you cannot even remember then it will crash your copy. Perhaps you don't need all those system files that require special privileges to copy, but life is just too short to go through them all to figure out which ones are generic and which ones are those special Photoshop actions you need to keep. Even after dealing with all of that you may still face the problem of an external hard disk going to sleep during a multi-hour backup. Arghhh. Another reason to use FastCopy instead of Windows copy is that it does incremental copying so you don't have to start again from scratch if your massive copy run crashes some time in the middle of the night.
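The two properties that matter for a copy of this size are that one bad file does not stop the whole run and that a restart does not begin from zero. Here is a minimal Python sketch of that idea. It is not FastCopy's algorithm; `resumable_copy` is a made-up name, and treating "destination exists with the same size" as "already copied" is a simplifying assumption.

```python
import shutil
from pathlib import Path


def resumable_copy(source: Path, destination: Path) -> list[tuple[Path, str]]:
    """Copy a tree, skipping files already copied and collecting errors."""
    failures = []
    for src in source.rglob("*"):
        dst = destination / src.relative_to(source)
        try:
            if not src.is_file():
                continue
            if dst.exists() and dst.stat().st_size == src.stat().st_size:
                continue  # assumed finished on an earlier run
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)
        except OSError as err:  # PermissionError is a subclass of OSError
            failures.append((src, str(err)))  # note the problem and carry on
    return failures
```

The returned list shows which files were refused, so only those few need a trip through the file security dialogs, and rerunning the function picks up where the previous run stopped.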

If you are serious at all about backup you will realise that having a single backup sitting beside your PC is not enough. A single lightning surge on the mains could kill both primary and backup drive. To do the job properly you really need a third copy off site. Now I will admit that we cheat a bit on this. We don't keep a regularly updated off-site copy but whenever we replace a hard disk we send the old one to a relative. Given that this only happens every couple of years we would still lose a lot of stuff in a catastrophe but at least the kids' baby photos are safe and I guess if the house does burn down we will have more to worry about than last year's holiday snaps.

Finally and perhaps most problematic of all is the issue of long term data storage. Our photos and videos will still be important to us and to our children in fifty years' time (I am an optimist - I aim to hit the century). Modern hard disks have only a TWO YEAR warranty. This is actually a shorter warranty than disks had a decade ago, which is worrying. Do the disk manufacturers know something that we don't? I think it is unreasonable to expect to be able to retrieve data from a hard disk that has been sitting on a shelf for ten years, never mind fifty. This problem is compounded by the advance of technology. When we got married back in the early 1990s the dominant backup technology was the floppy disk. Try getting a photo off one of those onto your iPad. Even in the last decade SATA has replaced IDE as the internal disk drive interface. The USB interface has managed to maintain backwards compatibility through several revisions but that won't last forever. I think the only viable solution to this issue is to make a new copy of all your old files every time you upgrade. When USB 4.0 is brand new it will still be possible to retrieve data from a USB 3.0 disk but that may not be the case ten years down the road. The fresh copy on a brand new disk should also protect you for a few more years against your data perishing. There is a small danger of losing integrity during the copying process, which is why it is essential to verify your copy immediately after it is made (FastCopy can do this). Of course you must also back up the new disk immediately to protect against infant mortality.
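Verifying a copy comes down to reading both copies back and comparing them. One way to sketch this in Python is to compute a SHA-256 checksum for every file in each tree and list the names that disagree. This is an illustration only, not a description of FastCopy's own verify option; the function names are invented for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large image files never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def differences(original: Path, copy: Path) -> list[str]:
    """List files that are missing on either side or whose contents differ."""
    a, b = manifest(original), manifest(copy)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

An empty list from `differences` means every file exists on both disks with identical contents. Expect this to take about as long as the copy itself, since every byte on both disks has to be read.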

So there you have it. Backup is far from a simple process. Am I overthinking this? I am pretty sure that most people do not set aside most of a weekend for setting up a backup process every time they buy a new hard disk. Nevertheless I cannot think of a way to do it more quickly and still ensure that our data will be available to us in the future.

Edit: For completeness I should point out that the last time I did this I used disk cloning rather than file copying. Disk cloning creates an exact copy of a disk onto a brand new disk or disk partition. It is generally quicker and easier than file-by-file copying and it gets over the whole issue of file permissions. This time however I deliberately decided to do file-by-file copying for several reasons. Firstly, every time you clone a disk you end up with a new physical or logical disk hanging off your PC. I really wanted to reduce the multiplication of drives so I moved all the archived stuff to folders on one big drive instead. Secondly, even though cloning ignores file permissions it doesn't solve the problem. When you try to access files on a cloned disk from another PC you may well find you cannot because the new PC thinks they don't belong to you.
