Which shows you what I know about RAID.
The Ur-Guru advised editing my original post so I wouldn’t look so ignorant, but the truth is that when it comes to RAID and other advanced computing issues, I am ignorant. In case I haven’t mentioned it lately, I am not a Real Geek. (Well, yes, I am a geek, but not a computer geek. I’m a classicist by training, and we’re plenty geeky, but mostly involving things that happened a couple of thousand years ago.)
So instead I’m posting a correction.
I still don’t know the exact details of this particular ongoing drama (except that the company may have lost 3 years’ worth of transactions and other data). But as I thought back on what the business owner told me, it seemed to me that it wasn’t that half the array had been dead for months. Rather, they’d had to replace one of the disks a few months ago, and now the second disk had gone and the RAID mirror hadn’t been mirroring.
The Ur-Guru (never short of opinions) responded to my conjecture with one of his own:
They had a functional RAID1 setup.
The mirror disk failed.
They replaced the mirror disk.
They had a cheapo (probably onboard) RAID controller that does not do an automatic rebuild so the replaced mirror disk never got mirrored properly (it was just a clean disk in there with no data).
Then the real disk failed.
Then they hoped the mirror would save them, but the mirror was never a mirror after the replacement.
A serious controller would’ve asked whether to rebuild the mirror or not, or would have been set to do it automatically; that is the common behavior once a disk is replaced with a different one. The cheap onboard controllers require more manual intervention and… due to user error they never really had a mirror. I think that is the story.
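For readers who think better in code than in prose, the sequence the Ur-Guru describes can be sketched as a toy Python model. This is only an illustration of his conjecture, not how any real controller works: the `ToyMirror` class, the `auto_rebuild` flag, and the ledger names are all made up for the example.

```python
class ToyMirror:
    """Toy two-disk RAID1 model (hypothetical, for illustration only)."""

    def __init__(self, auto_rebuild):
        self.auto_rebuild = auto_rebuild
        self.disks = [{}, {}]          # each disk maps a file name to its data
        self.in_sync = [True, True]    # only in-sync disks receive writes

    def write(self, name, data):
        # A write goes to every disk that is a working member of the mirror.
        for i, disk in enumerate(self.disks):
            if self.in_sync[i]:
                disk[name] = data

    def fail(self, index):
        # The disk dies and everything on it is gone.
        self.disks[index] = None
        self.in_sync[index] = False

    def replace(self, index):
        # A blank disk goes in. It is NOT a mirror until it is rebuilt.
        self.disks[index] = {}
        self.in_sync[index] = False
        if self.auto_rebuild:
            self.rebuild(index)

    def rebuild(self, index):
        # Copy everything from the surviving disk onto the new one.
        source = self.disks[1 - index]
        if source is None:
            raise RuntimeError("no surviving disk to rebuild from")
        self.disks[index] = dict(source)
        self.in_sync[index] = True

    def recover(self):
        # Return whatever is on the first disk that still physically exists.
        for disk in self.disks:
            if disk is not None:
                return dict(disk)
        return {}


def simulate(auto_rebuild):
    mirror = ToyMirror(auto_rebuild)
    mirror.write("ledger-year-1", "a")   # both disks get this
    mirror.fail(1)                       # the mirror disk fails
    mirror.replace(1)                    # a clean disk is installed
    mirror.write("ledger-year-2", "b")   # business carries on
    mirror.fail(0)                       # then the real disk fails
    return mirror.recover()
```

In this sketch, `simulate(True)` gets both ledgers back from the replacement disk, while `simulate(False)` gets back an empty disk, which is the "mirror that was never a mirror" outcome.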
I don’t know whether that’s the story in this particular case, but it’s certainly something that could happen and something that any readers who do have RAID arrays should be aware of. To reinforce that point, another Real Geek friend chipped in with:
Most RAID controllers that I have used won’t let the machine boot if both drives aren’t working okay, so I don’t know what’s up with your client’s system. I dislike RAID, because only about half the time is it a drive failure that causes you grief. The other half, a virus erases some files, or you delete something by mistake, and the RAID controller obediently makes the same change to your mirrored drive. It gives you a false sense of security, and costs too much for the benefit you do receive.
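The point about the controller "obediently" copying mistakes can also be shown with a small Python sketch. Again, this is a deliberately simplified picture with made-up names (`mirrored_delete_demo`, `invoices.db`): two dictionaries stand in for the mirrored drives, and a third stands in for a separate backup taken before the accident.

```python
import copy


def mirrored_delete_demo():
    # Two live copies kept identical by the (imaginary) RAID controller.
    primary = {"invoices.db": "3 years of transactions"}
    mirror = dict(primary)

    # A separate backup, copied earlier and not touched by the controller.
    backup = copy.deepcopy(primary)

    # An accidental delete (or a virus) hits the file. The controller
    # applies the same change to both drives in the mirror.
    for drive in (primary, mirror):
        drive.pop("invoices.db", None)

    return primary, mirror, backup
```

After the delete, both mirrored drives are empty and only the backup still holds the file. A mirror protects against one drive dying, not against changes you didn’t mean to make.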
(And I have no idea what was wrong with the backup system that was supposed to be copying the data on the moribund drive and why there was nothing, or nothing useful, on the tapes.)
Apparently, either none of my other readers noticed it was impossible for one disk in an array to be dead without affecting the others, or no one else actually read the reminder, because no one else said anything.
Which then leads me to wonder whether I shouldn’t just have kept my mouth shut in the first place.