[SEE UPDATE due to changes in a Snow Leopard patch]
I've finally completed a whole RAID 1 backup cycle with Snow Leopard and I can reliably report on how it works.
The process, when performed reliably, is essentially unchanged from earlier versions of Mac OS X. [Details added 3/4/11].
Specifically, you must never attach an old software RAID 1 drive to the working RAID 1 set. If the set is missing a drive ("degraded") when you attach the old drive, the RAID software will treat that drive as a current member of the set. THIS IS BAD.
You must always erase a drive's partition header completely before adding it back into a RAID set. Otherwise it's misidentified as being an up-to-date part of the RAID 1 set even though it may not have been updated in months.
I had thought that changes made to RAID handling in Snow Leopard might have fixed this problem. Nope.
Why RAID for backup?
I use the RAID 1 mechanism to keep a working, off-site backup of my Mac Pro system. The Pro contains three hard drives:
- System Drive (500 GB)
- RAID 1 drive X (1.5 TB)
- RAID 1 drive Y (1.5 TB)
The two RAID drives appear to the system as a single drive. I use the drive primarily to keep Time Machine backups. I also store a few special, very large files on the RAID, including virtual machine images and my Aperture photo library. Time Machine doesn't handle those types of files efficiently, so I keep them in a separate directory on the RAID.
In this arrangement, if any single drive fails, I can easily recover:
- System drive dies: I install a new system drive and rebuild it from the Time Machine backup.
- RAID drive X dies: I replace the drive and use the "Rebuild" feature in the RAID software.
- RAID drive Y dies: I replace the drive and use the "Rebuild" feature in the RAID software.
RAID by itself isn't a real back-up
The risk remains that I might lose the whole system, for example through fire or theft. To guard against this I rely on an off-site backup.
The off-site backup is also a RAID drive. Every few weeks, I swap out one of the RAID drives, and replace it with a blank hard drive. Then I tell the RAID software to rebuild the RAID 1 pair. This copies everything from the existing drive to the new one. Meanwhile I place the swapped-out drive in a safe off-site location.
Each time I perform this swap, I update the off-site backup with the latest set of files saved by Time Machine. It also contains the latest copy of my virtual machines and of my Aperture library.
Updating an off-site RAID Backup
Here is how the process works. Assume that the Mac Pro currently contains two RAID 1 drives, which we'll call X and Y, and that Drive Z is the current off-site backup. I will swap out Drive X and swap in Drive Z.
- Bring Drive Z back from its safe, off-site location.
- Connect it to a Windows machine and use DISKPART to completely erase its partition information. There should also be Unix/Linux utilities to do this.
- Shut down your Mac.
- Swap out Drive X and swap in Drive Z.
- Boot the Mac. Click "Ignore" in the unreadable disk alert.
- Start up Disk Utility ("DiskUtil" from here on) and format Drive Z. Give it an obvious name like "Empty."
- Use DiskUtil to tell the RAID 1 set to forget about Drive X ("demote" it). This is a two-step process that you perform while displaying the RAID set:
  First, select Drive X (the RAID display indicates that it's off-line) and click the "Delete" button, or the "-" button. The "Demote" button appears.
  Second, with Drive X still selected, click the "Demote" button and approve the warning dialog.
- DiskUtil should now show only one drive in the RAID set. Drag Drive Z (the "Empty" drive) into the RAID set. The GUI is a little sensitive here: do not "click" on Drive Z; simply press the mouse button and drag. If you click on Drive Z, DiskUtil will switch away from the RAID display.
- If all went well, the "Rebuild" button should appear when you select Drive Z in the RAID set. Click it, and the drive will rebuild. This takes several hours with large, modern drives.
- Meanwhile, Drive X is your new, up-to-date back-up drive for off-site storage. You can put it away while the other drive is rebuilding itself.
This is all especially convenient with a Mac Pro because the drives easily pop in or out on simple brackets. The drives slide in and attach directly to SATA connectors.
Using DISKPART
We have to use DISKPART because the Mac is too stupid to recognize an out-of-date RAID drive. Our Drive Z is usually going to be an older drive pulled out of the same RAID set. If we simply reattach it to the Mac, the RAID software will add it back to the RAID set. Don't be misled by the dire warnings of DiskUtil that you can't possibly add a "demoted" drive back into a running set. It has happened to me and it's Bad.
The only way to avoid this is to erase the partition information. DiskUtil can't do this because it treats the drive as a valid RAID set member.
Thus, we have to connect the drive to a completely different system. To be honest I haven't tried connecting it to a different Mac to erase the partition information. I'm skeptical of success, given the difficulties I've had already.
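For readers without a Windows machine, here is a rough Python sketch of the same idea for a Unix or Linux box: overwrite the first and last megabyte of the drive with zeros. That covers the usual places partition tables live (GPT keeps a backup copy at the end of the disk). The function name wipe_ends and the 1 MiB span are my own choices for illustration, and I have not verified that this removes everything Apple's RAID software looks at, so treat it as an untested assumption rather than a recipe.

```python
import os

def wipe_ends(path, span=1024 * 1024):
    """Overwrite the first and last `span` bytes of `path` with zeros.

    `path` may be a disk image or, when run as root, a block device.
    Hypothetical helper: it is NOT confirmed to clear Apple RAID metadata.
    Returns the total size in bytes that was detected.
    """
    with open(path, "r+b") as f:
        size = f.seek(0, os.SEEK_END)   # size of the file or device
        span = min(span, size)          # never write past the end
        zeros = bytes(span)
        f.seek(0)
        f.write(zeros)                  # clear the start of the disk
        f.seek(size - span)
        f.write(zeros)                  # clear the end of the disk
        f.flush()
        os.fsync(f.fileno())            # push the writes out to the device
    return size
```

Try it on a scratch file first. Pointing it at a real device is every bit as destructive as the "clean" command described below, and it is just as easy to name the wrong drive.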
Here is how we use DISKPART:
- Connect Drive Z to another machine (I use a Windows laptop and a SATA to USB converter).
- Log in as an administrator and run DISKPART from an elevated command prompt.
- Use the DISKPART "clean" command to erase all partition information from Drive Z:
  - Warning: it is remarkably easy to "clean" the wrong hard drive! Pay very close attention to what you do in this step!
  - DISKPART doesn't use "drive letters"; it numbers the drives sequentially.
  - Use the list disk command to identify the right disk to clean.
  - Use the select disk=# command to select the disk to clean (substitute its number for "#").
  - Type detail disk to list out the details of the selected disk. Make sure that the size, connection, format, and other details match the disk you want to erase.
  - Type clean to clean off the partition headers.
  - Type exit to quit DISKPART.
- Disconnect the drive from the Windows machine.
We can safely attach the drive to the Mac once the partition data is cleaned off.
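The command sequence above can also be written out as DISKPART script files. This is only a sketch: the two function names are hypothetical, and splitting the work into a harmless "inspect" script and a destructive "clean" script is my own suggestion, not something DISKPART requires.

```python
def _check_disk_number(disk_number):
    # Reject anything that is not a plain non-negative integer.
    if isinstance(disk_number, bool) or not isinstance(disk_number, int):
        raise ValueError("disk number must be an integer")
    if disk_number < 0:
        raise ValueError("disk number must not be negative")

def build_inspect_script(disk_number):
    """Return script text that lists the disks and shows one disk's details.

    This script changes nothing on any drive.
    """
    _check_disk_number(disk_number)
    lines = ["list disk", f"select disk={disk_number}", "detail disk", "exit"]
    return "\n".join(lines) + "\n"

def build_clean_script(disk_number):
    """Return script text that erases the partition information of one disk.

    DESTRUCTIVE: only use after the inspect output has been checked by eye.
    """
    _check_disk_number(disk_number)
    lines = [f"select disk={disk_number}", "detail disk", "clean", "exit"]
    return "\n".join(lines) + "\n"
```

Save the returned text to a file and run it with diskpart /s followed by the file name, from an elevated command prompt. Run the inspect script first and read the output. Disk numbers can change when drives are plugged in or removed, so don't reuse an old clean script without checking again.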
Danger, Will Robinson!
If you plug the Z drive into a machine with a lot of other drives, Be Very Careful. It is too, too easy to run DISKPART on the wrong drive. Double check the drive's size, label (if any), and method of connection to ensure you erase the correct drive.
If you run DISKPART on the wrong drive, you're reduced to using file recovery tools to get back the contents of the erased drive. I've been there and it's not a nice place to go.
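One way to make the size check less error-prone is to let a small script pick out the disks whose size matches the drive you expect. The Python sketch below works on pasted "list disk" output. The sample text only approximates the layout DISKPART prints and may not match your Windows version, and the function names and the 5% tolerance are assumptions of mine.

```python
import re

# Illustrative text only: an approximation of "list disk" output.
SAMPLE_LIST_DISK = """\
  Disk ###  Status         Size     Free     Dyn  Gpt
  --------  -------------  -------  -------  ---  ---
  Disk 0    Online          465 GB      0 B
  Disk 1    Online         1397 GB      0 B
  Disk 2    Online         1397 GB      0 B
"""

# Matches rows such as "Disk 1    Online    1397 GB"; a leading "*" marks
# the selected disk. Rows with a multi-word status are simply skipped.
ROW = re.compile(r"^\s*\*?\s*Disk\s+(\d+)\s+(\S+)\s+(\d+)\s+([KMGT]?B)\b")

# Assumption: the sizes shown are in binary units (1 GB = 1024**3 bytes).
UNIT = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}

def parse_list_disk(text):
    """Return a list of {'number', 'status', 'size_bytes'} dictionaries."""
    disks = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            number, status, size, unit = m.groups()
            disks.append({
                "number": int(number),
                "status": status,
                "size_bytes": int(size) * UNIT[unit],
            })
    return disks

def candidates_by_size(disks, expected_bytes, tolerance=0.05):
    """Return the numbers of disks within `tolerance` of the expected size."""
    return [
        d["number"] for d in disks
        if abs(d["size_bytes"] - expected_bytes) <= tolerance * expected_bytes
    ]
```

With the sample text, asking for a 1.5 TB drive (1.5e12 bytes) returns two candidates, disks 1 and 2. That is exactly the situation where size alone can't tell you which drive to erase, and you have to fall back on the label and the connection type shown by detail disk.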