

Leslie Bradley & Kevin McDaid


Spreadsheets are ubiquitous in business, and the financial sector is particularly reliant on them. The level of spreadsheet error is known to be high, so it is often necessary to review spreadsheets using a structured methodology that includes a cell-by-cell examination.

This paper outlines early research into the use of Bayesian statistical methods to estimate the level of error in large spreadsheets during cell-by-cell examination, based on expert knowledge and partial spreadsheet test data.

The estimate can inform the decision about the quality of the spreadsheet and whether further testing is necessary.


Prior and posterior probability estimates

Suppose a spreadsheet contains 900 formula cells. This chart shows the likely number of error cells after 20 cells have been tested. The prior distribution shows that the most likely number of error cells is 110. The posterior distribution indicates that there are most likely 96 error cells.
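The update from prior to posterior can be illustrated with a minimal sketch. It is not the authors' model: it assumes the tested cells are drawn at random without replacement, so the likelihood is hypergeometric. The summary does not state the shape of the expert prior or the number of errors found among the 20 tested cells, so the flat prior and the value `ERRORS_FOUND = 2` below are placeholders, and the result will not reproduce the figures of 110 and 96 quoted above. The function name `posterior_error_count` is hypothetical.

```python
from math import comb


def posterior_error_count(prior, n_cells, n_tested, n_errors_found):
    """Return the posterior P(K = k | test data) for k = 0..n_cells.

    prior[k] is the prior probability that exactly k of the n_cells
    formula cells contain an error. The likelihood is hypergeometric:
    n_tested cells are sampled without replacement and n_errors_found
    of them turn out to be wrong.
    """
    unnormalised = []
    for k, p in enumerate(prior):
        # math.comb returns 0 when the lower index exceeds the upper one,
        # so values of k that contradict the test data get zero weight.
        likelihood = comb(k, n_errors_found) * comb(
            n_cells - k, n_tested - n_errors_found
        )
        unnormalised.append(p * likelihood)
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


N_CELLS = 900      # formula cells in the spreadsheet (from the text)
N_TESTED = 20      # cells examined so far (from the text)
ERRORS_FOUND = 2   # ASSUMPTION: not given in the text

# ASSUMPTION: a flat prior stands in for the expert prior.
flat_prior = [1.0 / (N_CELLS + 1)] * (N_CELLS + 1)

posterior = posterior_error_count(flat_prior, N_CELLS, N_TESTED, ERRORS_FOUND)
most_likely = max(range(len(posterior)), key=posterior.__getitem__)
print("Most likely number of error cells:", most_likely)
```

Replacing `flat_prior` with a list that encodes expert judgement (for example, one peaked near 110) would give a posterior that blends that judgement with the test results, which is the kind of comparison the chart describes.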


2009, EuSpRIG

Full article

Error estimation in large spreadsheets using Bayesian statistics