Wednesday, November 30, 2005

Dead Servers and RAID vulnerabilities

This is a follow-up to last Friday's Backup Reminder. The Ur-Guru pointed out to me that the dead-server, dead-backups double whammy couldn't have happened the way I described it, because "You can't be running a functional bootable system with RAID1 mirroring if the second disk(s) are dead and not functioning. Every write operation is done twice for RAID1. Guess what happens if one of the two (or every 2nd mirror disk) is dead and doesn't respond?"

Which shows you what I know about RAID.

The Ur-Guru advised editing my original post so I wouldn't look so ignorant, but the truth is that when it comes to RAID and other advanced computing issues, I am ignorant. In case I haven't mentioned it lately, I am not a Real Geek. (Well, yes, I am a geek, but not a computer geek. I'm a classicist by training, and we're plenty geeky, but mostly involving things that happened a couple of thousand years ago.)

So instead I'm posting a correction.

I still don't know what the exact details of this particular ongoing drama are (except that the company may have lost 3 years' worth of transactions and other data), but as I thought back on what the business owner told me, it seemed to me that it wasn't that half the array had been dead for months, but that they'd had to replace one of the disks a few months ago and now the second disk had gone and the RAID mirror hadn't been mirroring.

The Ur-Guru (never short of opinions) responded to my conjecture with one of his own:

"They had a functional RAID1 setup.
The mirror disk failed.
They replaced the mirror disk.
They had a cheapo (probably onboard) RAID controller that does not do an automatic rebuild so the replaced mirror disk never got mirrored properly (it was just a clean disk in there with no data).
Then the real disk failed.
Then they hoped the mirror would save them, but the mirror was never a mirror after the replacement.

"A serious controller would've asked whether to rebuild the mirror or not, or would have been set to do it automatically. Once the disk was replaced with a different one that is the common thing it does. The cheap onboard controllers require more manual interference and... due to user error they never really had a mirror. I think that is the story."

I don't know whether that's the story in this particular case, but it's certainly something that could happen and something that any readers who do have RAID arrays should be aware of. To reinforce that point, another Real Geek friend chipped in with

"Most RAID controllers that I have used won't let the machine boot if both drives aren't working okay, so I don't know what's up with your client's system. I dislike RAID, because only about half the time is it a drive failure that causes you grief. The other half, a virus erases some files, or you delete something by mistake, and the RAID controller obediently makes the same change to your mirrored drive. It gives you a false sense of security, and costs too much for the benefit you do receive."

The point about copying viruses or other errors is a very, very important one to remember. A properly configured mirror will protect you against hardware failure, but not against other threats to your data. RAID is not a replacement for backups.
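
For readers who think better in code, here is a toy model of both failure modes: the replacement disk that never got rebuilt, and the deletion that gets mirrored as faithfully as any write. It is only a sketch in Python. The class and method names are invented for illustration, each "disk" is just a dictionary, and no real RAID controller works at this level.

```python
class ToyMirror:
    """Toy two-disk RAID 1 set; each disk is a dict of filename -> contents."""

    def __init__(self):
        self.disks = [{}, {}]

    def write(self, name, data):
        # RAID 1 sends every write to every disk in the set.
        for disk in self.disks:
            disk[name] = data

    def delete(self, name):
        # Deletions (accidental or malicious) are mirrored too.
        for disk in self.disks:
            disk.pop(name, None)

    def replace_disk(self, index, rebuild):
        # A replacement disk arrives blank. Without a rebuild it only
        # ever receives writes made after it was installed.
        self.disks[index] = {}
        if rebuild:
            survivor = self.disks[1 - index]
            self.disks[index] = dict(survivor)


# Scenario 1: the mirror disk is swapped but never rebuilt.
shop = ToyMirror()
shop.write("orders-2003.db", "three years of transactions")
shop.replace_disk(1, rebuild=False)
shop.write("orders-today.db", "this morning's sales")
# Disk 1 now holds only orders-today.db; if disk 0 dies, the old data is gone.

# Scenario 2: a healthy mirror copies a mistake just as carefully.
office = ToyMirror()
office.write("report.doc", "final draft")
office.delete("report.doc")
# Both disks are now empty; only a separate backup could bring the file back.
```

In the first scenario the set looks like a mirror from the outside, which is exactly why nobody notices until the original disk fails.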

(And I have no idea what was wrong with the backup system that was supposed to be copying the data on the moribund drive and why there was nothing, or nothing useful, on the tapes.)

Apparently, either none of my other readers noticed it was impossible for one disk in an array to be dead without affecting the others, or no one else actually read the reminder, because no one else said anything.

Which then leads me to wonder whether I shouldn't just have kept my mouth shut in the first place.


Friday, November 25, 2005

FileSlinger™ Backup Reminder 11-25-05: Give Thanks for Your Backups

The United States celebrated Thanksgiving yesterday with the usual exuberant overindulgence in turkey, cranberry sauce, and pumpkin pie. Before you head out for the nation’s biggest shopping day, take a moment to be grateful for your backups.

Even though backing up your computer can be a tedious process, and it might be annoying to have all that storage space taken up with your drive images and file copies, or to be sending tapes and CDs offsite, or to be paying monthly fees for an online backup service, those backups are sparing you from far greater inconvenience and expense. Like insurance, backups can mean the difference between staying in business and going out of business.

I’ve talked to three people in as many weeks who’ve had serious computer failures. One colleague was thinking he ought to clean up and back up his Entourage database (which held all his mailing list information as well as his business contacts), but didn’t have time before going out of town. When next he tried to use the database, it had become hopelessly corrupted. Last I heard, the computer was off at DriveSavers (www.drivesavers.com). His chances of getting his data back are good, because there’s no physical damage to the drive, but the price will be high, and the timing was terrible.

There’s never a good time to have your server go down, discover that half your RAID 1 array has been dead for months, and *then* find out that your tape backups have also failed. Running a business which handles hundreds of transactions a day without a computer is no one’s idea of a good time. The dead server is in a white room at Lazarus Data Recovery (http://www.lazarus.com/) and the owner of the afflicted business is looking at alternative backup solutions.

A few days ago I was talking to a friend on the phone and she mentioned that she really would like to get a new computer soon. No sooner had the words left her mouth than her motherboard failed, though it wasn’t until she took the machine to the shop that she knew the problem wasn’t with the drive. Fortunately for her, she had backups of almost everything, no more than a week old—and she has a lumbering old computer that she can use to access those files while the main machine is repaired.

So if you have backups—be grateful. And if you don’t have backups—make some, and give yourself a reason to be grateful. There are probably some great external drives, DVD-burners, and other helpful tools available at the Thanksgiving sales.


Friday, November 18, 2005

FileSlinger™ Backup Reminder 11-18-05: Getting the WinBackup Uniblues

The first thing you notice about WinBackup 2.0 Pro is that Uniblue provides not only an actual CD but also a printed quick start guide, in almost-flawless English. (Does an “extensible” online library of white papers stand on telescoped legs like a camera tripod?) There’s also a 54-page PDF manual available from the “Help” menu.

Another thing to like about WinBackup is the modest demands it makes on your hard drive and operating system. SmartSync Pro, which I reviewed for Kickstartnews.com in September 2005, is a good program, but Windows tells me it takes up 239 MB on my modest-sized laptop hard drive, which is less than ideal. Symantec Ghost Corporate 8 Console, by contrast, requires 72.97 MB. More compact than either, WinBackup needs only 41 MB.

WinBackup 2.0 Pro gives you the option to back up to external hard drives, removable drives, USB drives, CD-R, CD-RW, DVD-R, DVD-RW, tape or LAN, with options for filtering and compression, “total” or incremental backup, and individual file restoration.

WinBackup’s interface resembles Windows Explorer crossed with tabbed browsing. Each step in the backup process is indicated by a tab across the top of the screen, below which is the Work Area. On the far left is the Task Pane.

I was curious to discover that Outlook showed up in two places in the Backup Sources list: under “Shortcuts” and under “Agents.” Between the two, WinBackup provides the means to back up not just Outlook PST files but Outlook Options covering everything from Mail format to Preferences to Spelling, not to mention the RSS feeds I get through NewsGator and my Skylook conversations. It’s still a bit confusing to have Outlook in two places, but I believe this is because NewsGator and Skylook rely on Microsoft Exchange.

Other “Shortcuts” include Microsoft Windows, Outlook Express, and Internet Explorer. (I don’t have any other agents, and they don’t seem to be covered in the manual, so I’m not sure what else might show up as an agent in WinBackup.) Or you can select a whole drive to back up.

Once you’ve chosen your source files, WinBackup calculates their size and asks you for your destination. The calculation can take quite a while if you’re backing up a large hunk of data (which my outlook.pst and archive.pst files are), and it’s easy to get impatient with the “Estimating Job Size” progress bar. You can move on to “Settings” before it finishes its estimate, however.

This is where you decide whether you want a total or incremental backup, what level of compression you prefer, whether you want to encrypt your backup, and what actions WinBackup should take before or after backing up. (The default choices are “Close Outlook During Backup” and “Shut Down After Backup.”) You can also set up exclusion filters (by file type) and decide whether to erase the CD/DVD or append a new session, if you’re using optical media. Perhaps most importantly, you can choose to verify your backup.

It was at this point in my initial test that WinBackup started to get upset and toss list exclusion errors (whatever those are) at me. Being impatient to conclude a backup and this review, I went back to the beginning and chose a different set of files to back up onto my external drive.

Scheduling options are fairly primitive: never, daily, weekly, monthly, and yearly. This lacks the flexibility of SmartSync Pro or even the free program Karen’s Replicator which I use for file backups when I start my computer.

My backup job, of a 744 MB folder containing 5262 files, run at high compression with full verification and password-protection, took two hours and 14 minutes over a USB 1.1 connection to an external drive. That’s a very long time to back up such a small quantity of data; the Ghost image of my entire drive only takes about that long (also done over USB 1.1 to the same external drive).

On the other hand, the Ghost backup, although compressed, is not password-protected and isn’t verified. The way I verify it is to open Ghost Explorer and take a look at the files, then extract one at random. (I’ve learned the hard way that if a Ghost backup is damaged, you can’t open it at all.)
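
The same spot-check can be automated for backups stored in an ordinary ZIP archive. The sketch below is Python and makes several assumptions: the archive is a plain ZIP (not a Ghost image or a WinBackup .w2b file, whose formats I can't vouch for), the member names match the paths under the original folder, and the function name spot_check is my own invention.

```python
import hashlib
import random
import zipfile
from pathlib import Path


def sha256_of(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def spot_check(archive_path, source_root, samples=1, rng=None):
    """Restore a few randomly chosen files in memory and compare each
    with the original on disk. Returns the names that did not match."""
    rng = rng or random.Random()
    failures = []
    with zipfile.ZipFile(archive_path) as archive:
        # Skip directory entries, which end with a slash.
        members = [n for n in archive.namelist() if not n.endswith("/")]
        for name in rng.sample(members, min(samples, len(members))):
            # zipfile checks the stored CRC while reading the member.
            restored = archive.read(name)
            original = (Path(source_root) / name).read_bytes()
            if sha256_of(restored) != sha256_of(original):
                failures.append(name)
    return failures
```

An empty list means every sampled file came back identical to the original. A file edited since the backup will also show up as a mismatch, so this is best run right after the backup finishes.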

Maximum compression reduced my 744 MB of data to a 523 MB .w2b file. (I was already using Windows compression on many of the files; this may be a confounding variable.)
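
The timing and compression figures above reduce to two quick calculations, sketched here in Python. The function names are mine, and I am assuming 1 MB = 1024 KB.

```python
def throughput_kb_per_s(megabytes, hours, minutes):
    # Average rate over the whole job, source size divided by elapsed time.
    seconds = hours * 3600 + minutes * 60
    return megabytes * 1024 / seconds


def compression_saving(original_mb, compressed_mb):
    # Fraction of the original size that compression removed.
    return 1 - compressed_mb / original_mb


rate = throughput_kb_per_s(744, 2, 14)
saving = compression_saving(744, 523)
print(f"{rate:.0f} KB/s, {saving:.0%} smaller")  # prints: 95 KB/s, 30% smaller
```

USB 1.1 is rated at 12 megabits per second, so an average of about 95 KB/s is far below what the cable can carry. That suggests, though it doesn't prove, that the compression, encryption, and verification steps were the bottleneck rather than the connection.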

Verification or no verification, the only way to be sure of a backup is to try restoring the files. Clicking WinBackup’s Restore tab takes you to a simple “select backup folder” command line/browser which lets you navigate in typical Windows fashion to your backup file. If the file is encrypted, you’ll get a password prompt, but once you’ve navigated that hurdle, WinBackup will load your backup file into its Work Area under the heading “Restore Source.”

The first time I tried this, the backup archive started to load and then got hung up in the middle, making me wonder whether it would take as long to re-load it as it had to back it up. I shut down WinBackup and restarted it, and then the backup file loaded quickly and gave me a selection of files to restore and a choice of where to restore them.

On my first restore test, I selected “all” under the “Replace” option, and WinBackup restored not only the file, but the entire directory tree, to my Temp folder. On my second test, I picked “None (Restore Missing Files Only)” and got the same result. The trick, I discovered, is to choose “Single Folder” rather than “Alternate Location” under “Restore To”. This will put the files into whichever folder you indicate without replicating the directory structure.

WinBackup 2.0 Professional works, and provides a decent if incomplete range of options. It’s generally easy to use. But it’s slow, and it doesn’t seem to get along very well with my system, for reasons unknown to me, which may be due to something else I have installed on my computer rather than to WinBackup alone. I use other programs, such as Microsoft Office, which are at least as prone to mysterious problems. But I need to have absolute confidence in my backup software, and WinBackup doesn’t inspire it.


Friday, November 11, 2005

FileSlinger™ Backup Reminder 11-11-05: Online Backups—All You Need is Broadband

Reading cue: items separated from the rest of the text by + are quotations from something I’ve written elsewhere. Items separated by = are quotations from Bob Cramer, used with permission. A series of # indicates a quotation from Mozy.

Bob Cramer at LiveVault is on a crusade to move the world to online backup. (In case you haven’t heard of LiveVault.com, go to www.backuptrauma.com for an entertaining introduction to the motivation behind their disk-based online backup services.)

To quote from Bob’s October 20 “Backup to the Future” blog post:
===============================
Disk will replace tape for backup.

Bandwidth will replace trucks for getting backups offsite.

Continuous data protection (CDP) will be a core to any future data protection product.

And backup and recovery as an ASP service to SMBs makes perfect sense.

So – what’s a CEO to do to get more business owners to understand they NEED this?
================================

What LiveVault is doing is sponsoring a webinar with John Cleese on Tuesday (which I just made an appointment in the middle of—very bright, Sallie).

I do believe it’s important to raise awareness about the need for backups—any kind of backups, even the much-vilified tape. And online backups, particularly of the automated variety, have a lot of advantages to them. The problem, however, is one I put to Bob this way:

+++++++++++++++++++++++++++++
How do you solve the speed problem? Uploading files over a consumer cable internet connection is painfully slow; via dial-up it's impossible.

There are millions of businesses out there without T1 connections, most of them companies which would never have considered using tape backups anyway. Disk-based backup for most of us is an external hard drive.

Give us the bandwidth and we'll jump on the bandwagon—after all, our priceless business data is probably even more vulnerable to disasters like Hurricane Katrina than, and certainly at as much risk of theft as, the tapes which large corporations truck offsite.
++++++++++++++++++++++++++++++

Bob’s response was as follows:

=============================
Unfortunately, the laws of physics sometime get in the way. So the equation looks something like this:

"How much data are you protecting" X "what is the daily change rate of that data" = "amount of data you need to move." You take this and divide by your available upstream bandwidth and it will tell you how long it will take to get the backup done.
Looking at it the other way, and matched against your business' RTO objectives, how much data needs to be recovered and how quickly, gets divided by downstream available bandwidth.

So - here's how we "screw around with physics": (1) continuous protection, (2) content reduction, (3) delta backup, and (4) delta restore - all automated in the background.

(1) continuous protection - this approach (often called CDP) enables you to move the changed data "as it changes" incrementally over the wire. By doing this, you spread bandwidth consumption uniformly throughout the day.

(2) content reduction and (3) delta backup - this makes certain you only move the absolute minimum amount of data over the wire necessary. This includes moving only changed blocks vs. files, and only moving the data once and then incremental forever, and not moving the same file more than once.

(4) delta restore - this takes a look at what data you have on your local computer, and what point in time version you want to recover to, looks at "just the differences", sends only those differences over the wire, and then reweaves the file.

So - of course, you need a broadband connection. But with the best online backup and electronic vaulting solutions out there, the bandwidth requirements can be reduced to the absolute minimum.
===================================
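
Bob's equation is easy to try with your own numbers. Here is a minimal sketch in Python. The function name and the sample figures (20 GB of data, 2% daily change, 384 kilobits per second upstream) are hypothetical values of mine, not anything from LiveVault, and the sketch ignores protocol overhead and compression.

```python
def transfer_hours(protected_gb, daily_change_rate, bandwidth_mbps):
    """Data protected x daily change rate = data to move;
    divide by available bandwidth to get the time required."""
    changed_gb = protected_gb * daily_change_rate
    changed_megabits = changed_gb * 1000 * 8  # decimal gigabytes to megabits
    seconds = changed_megabits / bandwidth_mbps
    return seconds / 3600


# Hypothetical small office: 20 GB protected, 2% changes per day,
# 0.384 Mbit/s upstream on a consumer connection.
daily = transfer_hours(20, 0.02, 0.384)   # about 2.3 hours
first_full = transfer_hours(20, 1.0, 0.384)  # about 116 hours, nearly 5 days
```

The same function works for the restore side of the equation: pass the amount of data to recover as protected_gb, a change rate of 1.0, and the downstream bandwidth.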

This is a good answer, assuming you know what “delta backup” and “RTO” mean, and Bob sensibly expanded on it a bit and turned it into its own blog post, with live links, on November 9th. (“Delta backup” is just fancy phrasing for only backing up what’s changed, if you were wondering, and “RTO” stands for “Recovery Time Objective.”)
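
To show what "only backing up what's changed" can look like at the block level, here is a rough sketch in Python. It is not LiveVault's method, which the quote doesn't describe in detail. It assumes fixed 4096-byte blocks compared position by position, so inserting data in the middle of a file would mark every later block as changed; tools such as rsync use rolling checksums to cope with that. All names here are invented for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size


def block_hashes(data: bytes, block_size=BLOCK_SIZE):
    # One digest per fixed-size block of the previously backed-up version.
    return [hashlib.sha256(data[i:i + block_size]).digest()
            for i in range(0, len(data), block_size)]


def changed_blocks(old: bytes, new: bytes, block_size=BLOCK_SIZE):
    """Delta backup: return {block_index: block_bytes} for every block
    of `new` that differs from the same position in `old`."""
    old_hashes = block_hashes(old, block_size)
    delta = {}
    for index, offset in enumerate(range(0, len(new), block_size)):
        block = new[offset:offset + block_size]
        is_new = index >= len(old_hashes)
        if is_new or hashlib.sha256(block).digest() != old_hashes[index]:
            delta[index] = block
    return delta


def apply_delta(old: bytes, delta, new_length, block_size=BLOCK_SIZE):
    """Delta restore: reweave the wanted version from the local copy
    plus only the blocks that were sent over the wire."""
    blocks = [old[i:i + block_size] for i in range(0, new_length, block_size)]
    for index, block in delta.items():
        blocks[index] = block
    return b"".join(blocks)[:new_length]
```

Change one byte in a 10,000-byte file and only one 4096-byte block needs to travel, instead of the whole file.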

But it doesn’t address the real problem, which is a problem for LiveVault and its competitors as much as for the small and home office computer user.

Forget “I want my MTV.” I want cheap broadband. At least I have broadband—but it’s $54/month for cable internet (with a dynamic IP address, yet) in the East Bay, with no provision for dial-up connections if the cable goes down and no POP mail access to my ISP’s mail account if I’m traveling. (One reason I don’t use that e-mail account much, even for personal mail.) DSL would be marginally less expensive, but when I checked out DSL, the signal was so bad that it was slower than dial-up. Scratch that idea. (And even the best DSL connection is slower than cable.)

Because I do so much of my business online, the cost of a cable internet connection falls under the heading of “necessary, and therefore affordable.” Compared to the cost of a T1 connection, it’s pretty trivial. Covad’s having a T1 special: only $259.50 per month for your first three months before the regular $599-$1799 price kicks in. (Pardon me while I give myself CPR for the heart attack reading those numbers induced. I knew it was expensive, but I had no idea.)

However, while I might put a T1 line down on the list of things I’d invest in if I started seriously raking in money, I don’t know a single home-office user who has one. I can’t, offhand, think of a home-office user who would need one. Not even the Ur-Guru. Unless, that is, he wanted to transfer the 2.7 TB he backs up weekly over the internet.

So I’m not going to lobby for T1 connections anyone can afford. But any online backup system, from a well-developed, enterprise-oriented service like LiveVault’s to the brand-new free service from Mozy (still in beta), requires a high-speed connection. And I know plenty of consultants who don’t have high-speed connections. (I know an embarrassing (for them) number of solo professionals with AOL accounts, too, but that’s a separate issue.) Either their businesses don’t rely heavily enough on the internet to make the cost of broadband worthwhile, or broadband simply isn’t available where they live.

The United States is dropping further and further down the list of countries with the greatest per capita broadband use. Korea has been top of the list since 2001, but this year the United Kingdom, formerly the nation with the lowest broadband penetration, has passed us up.

This is embarrassing, or would be if I had more national pride. I lived in Britain for four years in the mid-late Nineties. I loved it there, but couldn’t escape the sensation that where computers were concerned, the U.K. was five years behind the U.S. and at least two years behind the rest of Europe.

Nor is the size of our nation a decent excuse. Canada, which has a much smaller population in a far larger territory, has nearly twice as many broadband users as we do, and has done since at least 2001. (Figures taken from websiteoptimization.com.)

If I get into the possible reasons for this national broadband deficiency, I’ll be here for many hours and many pages, and this is already a long newsletter. Instead I’ll close with a call to action aimed at CEOs like Bob Cramer as well as the SOHO users who make up most of my own clients:

Lobby for broadband access. Without affordable broadband, 85% of the population is cut off from online backup solutions. (Make that affordable secure broadband—while I like the idea of free wi-fi as much as the next person, I’m not about to send my important business data over a café connection.)

And without broadband, these are your backup alternatives (as described by Mozy):

##################################
  • Burn a new CD or DVD every Sunday night and store them at your brother-in-law’s office like it's your religion.
  • Buy a $200 external hard drive and obsessively "push the button" and hope your office doesn't burn down.
  • Do nothing and don't worry about backup. (We suggest closing your eyes, plugging your ears and repeating "I'm in my happy place, I'm in my happy place".)
  • Run a cron job of rsync, gzip and mcrypt piped over ssh to your friend’s server over his DSL line.
#################################
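
For the curious, the last item on Mozy's list can be approximated in a few lines of Python. This sketch swaps in tar with gzip compression for the rsync and gzip steps and leaves out the mcrypt encryption entirely, so the archive is protected in transit by ssh but stored unencrypted on the friend's server. The function names, host, and paths are all placeholders of mine.

```python
import shlex
import subprocess


def build_pipeline(source_dir, remote, remote_path):
    # tar writes a gzip-compressed archive to standard output ("-").
    tar_cmd = ["tar", "-czf", "-", source_dir]
    # ssh runs `cat` on the far side to save whatever arrives on stdin.
    ssh_cmd = ["ssh", remote, f"cat > {shlex.quote(remote_path)}"]
    return tar_cmd, ssh_cmd


def run_backup(source_dir, remote, remote_path):
    tar_cmd, ssh_cmd = build_pipeline(source_dir, remote, remote_path)
    tar = subprocess.Popen(tar_cmd, stdout=subprocess.PIPE)
    ssh = subprocess.Popen(ssh_cmd, stdin=tar.stdout)
    tar.stdout.close()  # so tar is signalled if ssh exits early
    ssh_status = ssh.wait()
    tar_status = tar.wait()
    # Only call the backup good if both halves of the pipe succeeded.
    return tar_status == 0 and ssh_status == 0
```

A crontab entry along the lines of `0 23 * * 0 python3 /home/me/backup.py` (the path is made up) would run such a script every Sunday at 11 p.m., assuming ssh keys are already set up so that no password prompt appears.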

And if you’re one of those doomed to dial-up, I’d recommend any of those options but the third. They might be more trouble than automated online backup, but they’re nowhere near as much trouble as losing your data.


Friday, November 04, 2005

FileSlinger™ Backup Reminder 11-4-05: The Ur-Guru's Backups

The Ur-Guru has created his own automated backup routine to handle his frighteningly large array of computers (including several Virtual Machines). I asked him to describe his weekend backup process, and he said, “They might either jump out the window, fall asleep, or get confused after the first paragraph, I'm afraid. :-)”

Well, you might, at that. I find it fairly mind-boggling, myself. Visual aids would help, except that a map of the Ur-Guru’s network would cover an entire wall. I will, however, put a link to an image of a recent (but not current, as he’s been reconfiguring again) network map, and a photo of the monitor setup, on the blog.

For reference, Arena is the Ur-Guru’s main workstation. This is a man who almost never throws a computer away (his Amigas finally went to the old-age home a month or so ago), so he re-uses the workstations which are no longer up to speed for his software development work. Among those machines are Manticore, Nightshade, Meteor, Vortex, Whisper, and Moonlight. In addition he runs three servers: the “twins” Nova and Nebula, and his older server, Nexus.

So here it is, straight from the horse’s mouth. I’ve made a few minor grammatical and spelling corrections; English is not the Ur-Guru’s first language. (I think that was C++.)

It's not all that complex since most of it is entirely automated. First Nova and Nebula, the two servers, start up a process of imaging their boot disks (using each other as the storage location for the resulting image). They then proceed to create archive backups of all important data (which end up as ZIP files, incrementally). Then they synchronize data that needs to be identical on both and run a quality check to see if all the above went correctly. If at any point it didn't, they will restore the previous data and send off an e-mail to the administrator with the failure details. When all the backups and the images are created they are automatically copied over to a RAID5 array on Manticore.

During this time Arena, Manticore, and Meteor start doing their imaging process of the boot disks, storing them all on Manticore while Manticore stores its image to Arena. After this process the RAID arrays of Manticore and Arena perform a full synchronization of data, including the just recently created backup files.

This process will wait until Nitrous and Thunder (the laptops) are done synchronizing their data with the just freshly synchronized data on Manticore, while also taking partial synchronized data from Nova or Nebula (it figures out which of the two has the least to do and uses it to make sure no system is unduly stressed). If needed this process will also kick in a second synchronization process from Manticore to Arena and Meteor in case any files from the laptops are newer than the recently synchronized files.

Once all these processes are complete there are always two or more locations in which the backups of any system are available at any point in time. Manticore will proceed with a WOL (wakeup call) to Vortex which will power on and run a final check on all the backup processes that have completed. If any errors or failures are detected it will notify the administrator (me) via e-mail and loud annoying noises (just in case I fell asleep watching this boring but perfectly automated process).

If any files were changed on Whisper then Nova and Nebula would know about this. Vortex uses the information contained by Nova and Nebula to wake up Whisper to copy over its data to Manticore. The data from Whisper is separate from all other data.

Finally, Vortex will wake up Nexus and signal it to start a cleanup and redundancy process which involves removal of old archives and backups on itself and other systems in the network (in this process Nexus acts as a tertiary fail-safe for data that may be in error from a previous backup, allowing Nexus to directly restore any data to any system). When Nexus is done it will signal a power-off shutdown to Vortex after which it will shut itself down as well.

None of the Virtual Machines running on any of the systems are backed up. Their data is run through network locations on Nova and Nebula and that data was already part of the entire process so all I need to do for those is once in a while shut them down and just copy their disk images to a backup location. Similar to the PDA, a backup of it is done manually every time I feel like it or remember to do so.

The SQL server databases on Nova and Nebula keep each other in sync, so no additional backups are required. However, once a week a process runs that creates a full backup of the databases, which is later picked up during a weekend backup process as plain archives. These archives aren't rotated but are kept in storage until I manually delete the older ones (after I am absolutely certain no rollbacks of 2-3 months are needed).

The cost of all this: 2.7TB (2700GB) worth of backups and storage which runs over 2 separated gigabit network segments, completing the entire process in under 2 hours.

Who, me, paranoid?!

Now, aren’t you glad you only have your own machines to back up? Though I must say I envy him both the automation and the amount of storage space he has!

And it underscores the point that if you’re serious about computers, you need to be serious about backups.
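
One piece of the Ur-Guru's routine is simple enough to sketch: the WOL (Wake-on-LAN) call that Manticore sends to Vortex. A Wake-on-LAN "magic packet" is six bytes of 0xFF followed by the target machine's MAC address repeated sixteen times, usually sent as a UDP broadcast. The Python below is my own illustration, not the Ur-Guru's code. The MAC address in the example is made up, port 9 is a common convention rather than a requirement, and the sleeping machine's network card has to have Wake-on-LAN enabled.

```python
import socket


def magic_packet(mac: str) -> bytes:
    # Accept "00:11:22:33:44:55" or "00-11-22-33-44-55".
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError("a MAC address has 12 hex digits")
    mac_bytes = bytes.fromhex(hex_digits)
    # Six 0xFF bytes, then the MAC address sixteen times over.
    return b"\xff" * 6 + mac_bytes * 16


def wake(mac, broadcast="255.255.255.255", port=9):
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Broadcasting must be switched on explicitly for a UDP socket.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(magic_packet(mac), (broadcast, port))


# Example with a made-up address:
# wake("00:11:22:33:44:55")
```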
