I’ve seen all of the hard drive crash statistics around the internet, several of which you reference on your site. However, I’ve yet to find the source for any of the most startling ones. The one I’m most interested in, and hoping you can help me find the source of, is this:
“1 in 5 computers suffers a fatal hard drive crash in its lifetime.”
Do you know the source of this? Any similar statistics you do have sources for?
I couldn’t remember the source of that offhand, but I do try to cite sources when I know them, and the question is one worth asking.
Before you can accept the validity of statistics, it’s important to know where they come from. For instance, a couple of weeks ago I talked about the way manufacturers claim their drives have a “mean time to failure” of a million hours. A million hours is roughly 114 years, and hard drives were only invented in 1956, so no drive has actually lasted that long. Where do numbers like that come from?
From math of the sort I deliberately avoided, actually, and I won’t embarrass myself by trying to duplicate the equations here. You can read more about calculating Mean Time Between Failures (for things that can be repaired) and Mean Time To Failure (for things that can’t).
Bit-Tech.Net provides an explanation in English:
The MTBF is attained from running a large batch of drives, sometimes hundreds sometimes thousands, and measuring how often a drive fails. Given a batch of 2,000 drives, if one fails on average every 25 days, the drive would be given an MTBF of 1.2 million hours. Unfortunately, it isn’t quite as simple as that.
It’s usually the case that the tests are accelerated by altering the conditions, for example, increasing the temperature. The end result is highly dependent on these acceleration factors being accurate. The number of drives tested is also not standardised, so manufacturers are free to increase or decrease the amount of drives to attain the ideal MTBF rating.
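For anyone who wants to check the arithmetic in that example, here is a minimal sketch in Python. The function and variable names are my own invention, and it assumes the simplest possible case: every drive in the batch runs around the clock, and nothing is accelerated. It is an illustration of the quoted numbers, not a description of how any manufacturer actually runs its tests.

```python
HOURS_PER_DAY = 24
HOURS_PER_YEAR = 365 * HOURS_PER_DAY  # 8,760 hours, ignoring leap years


def batch_mtbf_hours(drive_count, days_between_failures):
    """Drive-hours the whole batch accumulates for each observed failure.

    Assumes every drive runs 24 hours a day (a simplification).
    """
    return drive_count * days_between_failures * HOURS_PER_DAY


# The Bit-Tech example: 2,000 drives, one failure every 25 days on average.
mtbf_hours = batch_mtbf_hours(2000, 25)
mtbf_years = mtbf_hours / HOURS_PER_YEAR

print(mtbf_hours)         # 1200000
print(round(mtbf_years))  # 137
```

In other words, the 1.2 million hours are spread across the whole batch. No single drive is expected to run for 137 years.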
Wikipedia adds that
Many manufacturers seem to exaggerate the MTBF of their products (e.g. Hard Drives), in order to either sell more product or sell for a higher price. A common way that this is done is to define the MTBF as counting only those failures that occur before the expected “wear-out” time of the device. Continuing with the example of hard drives, these devices have a definite wear-out mechanism as their spindle bearings wear down, perhaps limiting the life of the drive to five or ten years (say fifty to a hundred thousand hours). But the stated MTBF is often many hundreds of thousands of hours and only considers those other failures that occur before the expected wear-out of the spindle bearings.
But back to the question of where the scary statistics I sometimes quote come from. It’s usually storage manufacturers, backup service providers, or data recovery companies who commission studies about things like drive failure rates and the impact of data loss. The nature of the research may vary from something as informal as the poll on the home page of the FileSlinger™ Backup Blog (which has all of 68 respondents) to something as extensive as the recent Google study of more than 100,000 disks.
Here are some of the places I’ve found statistics about data loss and drive failures:
- ADR Data Recovery (Cites National Archives, ICSA Lab, Ontrack.)
- Data Deposit Box (Cites IDC, Cost of Downtime Survey, Harris Interactive.)
- Boston Computing Network (Cites Home Office Computing, Strategic Research Institute, Computer Economics.)
- Ontrack Data Recovery (Conducted their own “unscientific” survey of 1400+ computer users)
- Protect Data (Cites research from Ontrack)
- Imation Data Protection Survey
- The Diffusion Group
- Iron Mountain
But I can’t seem to find a specific source for the “1 in 5” quote, though it’s repeated all over the internet. I’m pretty sure I got it from the Data Deposit Box site, but they don’t mention where they got it, even though they cite sources for some of their other statistics.
I will say that the figure sounds right to me, based on my encounters with computers over the years. I’m typing this on my seventh computer, and none has yet suffered a drive failure, but as two are still alive, that means they still might. The Ur-Guru runs through several hard disks every year. Among clients, I’ve encountered system failures as often as drive failures, but without even having to concentrate I can think of three clients and two colleagues whose hard drives were beyond repair.
Of course, the nature of research is such that anything one study can “prove,” another study can “disprove.” You just need the right population sample.
If you know the source of that startling statistic, let me know and I’ll publish it on the Backup Blog!