I’ve mentioned battery backups (that is, uninterruptible power supplies, which provide electricity when PG&E decides to stage a rolling blackout) once or twice in the past. This past week the Ur-Guru had an adventure with his, so I’ve asked him to write about it.
One of the things often overlooked when thinking about backups is an uninterruptible power supply (UPS). We back up our data (don’t we?) but usually don’t think about providing backup power to the computer system itself. Because I use a set of two systems to provide frequent automated backups as well as several key services such as e-mail and web domains, it’s essential that these systems do not suddenly lose their power.
Loss of power can corrupt data. If a system is actively writing data to disk, a loss of power can corrupt not only the data it was writing but even the entire file system. Bad news if it happens to a “mission critical” system.
To prevent that from happening I had both of the (relatively small) systems powered by APC Back-UPS ES battery-powered power strips. Heavy and bulky as they are, these things are essential: they let the systems run on battery power during a blackout or other power malfunction while also protecting them from power surges. The 10 minutes these UPSes give the two systems is more than sufficient for the systems to shut down properly and wait for the mains power to come back on (at which point they power themselves back up again).
However, a UPS can break down. More importantly, the batteries in these things do not last. They need replacement every 2-3 years (3-5 if you believe the manufacturer, though I suspect those numbers are not based on 24/7 use).
A few days ago one of the APC Back-UPS devices decided it no longer liked me and started yelling at me through its audible alarm. Adding insult to injury, it then started flashing its little lights at me to express its utter dismay at my having completely forgotten to replace the battery, which it had decided was worn out. Then, in a final attempt at letting me know about its unhappy state, it decided to just completely break down on me (a slight tap on the device being enough to turn the power on or off, definitely not an APC Back-UPS feature!).
Two days later the second one decided that its battery needed to be replaced (not even 3 years after I bought and installed them). Except this UPS decided it wasn’t just unhappy but angry at me: instead of just sounding the beeping alarm and flashing the error lights, it cut the power to the server it was supplying for about 5 seconds. Needless to say, I’m not amused by devices that misbehave like that, and I considered it an attempt at intentional sabotage. I consider pulling the power from my server and sending it into a straight reboot without a proper shutdown to be an act of war.
Since I had decided I wanted to start using a different machine as the main server, it was a good time to get a completely new, different, and bigger UPS, so I ended up ordering the APC SC1500i model (1500VA, 865 Watts), which arrived at my dealer within a few days. At close to 22kg, this is not the kind of device you happily carry back home. But after running some tests it is showing that it can power both of the servers for about 30-35 minutes before instructing them to shut down. I hope this UPS behaves better than the previous two.
I would have expected the APC software, or the units themselves, to inform me when a battery needed to be replaced, but alas, that never happened (even though it should have), and I was lucky to get away with a scare instead of data corruption on the system. So it’s a good idea not to rely on software notifications and to mark down and keep track of approximately when you will need to order a replacement battery. Buying one as a spare long before you’re going to use it would be a waste, since it would only end up running out of warranty, but getting a replacement when it is needed is no luxury either.
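One way to keep track is a small reminder script. The following is only a minimal sketch, not anything APC provides: the function names, the 2-3 year window (taken from my own experience above), and the 30-day ordering lead time are all assumptions you should adjust to your own situation.

```python
from datetime import date, timedelta


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # Installed on Feb 29 and the target year is not a leap year.
        return d.replace(year=d.year + years, day=28)


def replacement_window(installed: date, min_years: int = 2, max_years: int = 3):
    """Return the (earliest, latest) dates by which the battery should be replaced.

    The 2-3 year default is an assumption based on 24/7 use, not a
    manufacturer figure.
    """
    return add_years(installed, min_years), add_years(installed, max_years)


def should_order(installed: date, today: date, lead_days: int = 30) -> bool:
    """True once we are within `lead_days` of the earliest replacement date."""
    earliest, _ = replacement_window(installed)
    return today >= earliest - timedelta(days=lead_days)


if __name__ == "__main__":
    installed = date(2020, 1, 1)  # hypothetical install date
    earliest, latest = replacement_window(installed)
    print(f"Replace between {earliest} and {latest}")
    if should_order(installed, date.today()):
        print("Time to order a replacement battery.")
```

Run from a daily scheduled job, something like this would nag you about a month before the battery reaches the short end of its expected life, which avoids both buying a spare years early and being caught out by a dead battery.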
The problem with automated backups, of course, is that they run unattended and always cause disk read/write activity that could suffer horribly when the power is cut unexpectedly. Another thing: if you’re in the US and suffer from what I call “third world cabling”, you may really want to consider a backup for your power. You very likely won’t need something too heavy to carry comfortably, but a simple and reliable UPS that allows your system to shut down properly might not be a luxury item, depending on your area (or in anticipation of the return of Enron). Pulling the power from a system that is writing to disk is often harmless, but it’s like playing Russian roulette with your ongoing file activity: however many times it has been harmless, there’s a decent chance of the next power loss being fatal to your data.
And don’t forget about those replacement batteries when it’s time!