95% of spreadsheets have errors

Really, that statistic needs to be repeated: 95% of spreadsheets have errors.

This is not just a random, made-up statistic. Extensive research into spreadsheets has consistently found that almost all spreadsheets have errors.

Large spreadsheets are even more likely to have errors, to the point that they are almost certainly wrong.

Research and experience agree

Our assertion that 95% of spreadsheets have errors is based on:

How can a 95% error rate be true?

It seems surprising that a spreadsheet almost certainly has errors. So, how does that happen?

Most people think that they are very accurate when doing most activities – and they're right. Research has shown that, for a wide range of cognitive tasks, humans make an error in only 1% to 5% of tasks.

For example, typical error rates for simple but non-trivial activities are:

  • Type a short number: 1.0% (per number).
  • Grammatical errors: 1.1% (of words).
  • Simple arithmetic: 2.0% (of calculations).
  • Software development: 3.7% (per line of code).
  • Type 10 digits: 5.0% (per number).

The spreadsheet Cell Error Rate (CER)

Experiments in spreadsheet development have seen similar rates, with errors in 1% to 5% of cells – this is called the "Cell Error Rate" (CER).

Although a CER of 1% to 5% seems low, the cascading nature of spreadsheet calculations means that errors accumulate through the calculations down to the bottom-line results.

Most spreadsheets have hundreds or thousands of formulae. Even with a small probability of error in each formula, the accumulated probability that the bottom-line results contain an error is high. Even for a small spreadsheet, that accumulated probability approaches 100%.

The probability that a spreadsheet has at least one error is shown in the following chart.

Chart: Spreadsheet error cascade model

The chart assumes that each cell has the same independent probability of containing an error. For example, if a spreadsheet has only 100 used cells (which is small), and assuming a moderate cell error rate of 3%, then the probability of at least one error in the spreadsheet is about 95%. This aligns well with experience.
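As a rough illustration of that calculation, the probability of at least one error in n used cells is 1 - (1 - CER)^n. The Python sketch below makes the same assumption as the chart, that every cell has the same independent probability of error. The function name is purely illustrative and does not come from the research.

```python
def prob_at_least_one_error(used_cells: int, cell_error_rate: float) -> float:
    """Probability that a spreadsheet contains at least one error.

    Assumes each used cell has the same, independent probability of
    containing an error (the simplifying assumption stated in the text).
    """
    if used_cells < 0:
        raise ValueError("used_cells must be non-negative")
    if not 0.0 <= cell_error_rate <= 1.0:
        raise ValueError("cell_error_rate must be between 0 and 1")
    # P(no errors at all) = (1 - CER) ** n, so P(at least one) is the complement.
    return 1.0 - (1.0 - cell_error_rate) ** used_cells


# The example from the text: 100 used cells at a 3% cell error rate.
print(f"{prob_at_least_one_error(100, 0.03):.3f}")  # prints 0.952
```

Trying other values for the number of used cells shows how quickly the result climbs towards 100% as a spreadsheet grows.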

The more used cells a spreadsheet has, the more likely it is that there are many errors. Even with a low cell error rate of 1%, a spreadsheet that has 1,000 used cells has a 97% probability of having at least 5 errors, and the expected number of errors is 10.
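Under the same assumption of independent, equally likely cell errors, the number of errors follows a binomial distribution, which is one way to reproduce these figures. The Python sketch below is a minimal illustration; the function names are hypothetical and chosen only for this example.

```python
from math import comb


def prob_at_least_k_errors(used_cells: int, cell_error_rate: float, k: int) -> float:
    """Probability of k or more errors, using a binomial model.

    Assumes each used cell independently contains an error with
    probability cell_error_rate.
    """
    # Sum P(exactly i errors) for i = 0 .. k-1, then take the complement.
    upper = min(k, used_cells + 1)
    p_fewer_than_k = sum(
        comb(used_cells, i)
        * cell_error_rate ** i
        * (1.0 - cell_error_rate) ** (used_cells - i)
        for i in range(upper)
    )
    return 1.0 - p_fewer_than_k


def expected_errors(used_cells: int, cell_error_rate: float) -> float:
    """Mean of the binomial distribution: n * p."""
    return used_cells * cell_error_rate


# The example from the text: 1,000 used cells at a 1% cell error rate.
print(f"{prob_at_least_k_errors(1000, 0.01, 5):.2f}")  # prints 0.97
print(expected_errors(1000, 0.01))
```

Real spreadsheets will not match the independence assumption exactly, so the output is best read as an indication of scale rather than a precise prediction.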

"In general, errors seem to occur in a few percent of all cells, meaning that for large spreadsheets, the issue is how many errors there are, not whether an error exists."

Source: Panko, What we know about spreadsheet errors

For more information about the modelling of cascading spreadsheet errors and accumulated error probabilities, see Calculation cascade: A common cause of catastrophe.