Spreadsheets are notoriously error-prone.
Cunha, et al (2011)
Despite being staggeringly error prone, spreadsheets are a highly flexible programming environment.
Abreu, et al (2015)
Spreadsheet errors are still the rule rather than the exception.
Nixon & O'Hara (2010)
The quality and reliability of spreadsheets is known to be poor.
Bishop & McDaid (2007)
Spreadsheets are commonly used and commonly flawed.
Caulkins, Morrison, & Weidemann (2008)
Spreadsheets are dangerous to their authors and others.
Durusau & Hunting (2015)
Developing an error-free spreadsheet has been a problem since the beginning of end-user computing.
Mireault (2015)
Untested spreadsheets are riddled with errors.
Miller (2005)
Spreadsheet errors have resulted in huge financial losses.
Abraham & Erwig (2007)
Never assume a spreadsheet is right, even your own.
Raffensperger (2001)
Errors in spreadsheets are as ubiquitous as spreadsheets themselves.
Colbenz (2005)
Spreadsheets are more fault-prone than other software.
Kulesz & Ostberg (2013)
Spreadsheets contain errors at an alarmingly high rate.
Abraham, et al (2005)
60% of large companies feel 'Spreadsheet Hell' describes their reliance on spreadsheets.
Murphy (2007)
Spreadsheets are extraordinarily and unacceptably prone to error.
Dunn (2010)
Despite overwhelming and unanimous evidence... companies have continued to ignore spreadsheet error risks.
Panko (2014)
Most executives do not really check or verify the accuracy or validity of [their] spreadsheets...
Teo & Tan (1999)
It is irrational to expect large error-free spreadsheets.
Panko (2013)
Research on spreadsheet errors is substantial, compelling, and unanimous.
Panko (2015)
...few incidents of spreadsheet errors are made public and these are usually not revealed by choice.
Kruck & Sheetz (2001)
Every study, without exception, has found error rates much higher than organizations would wish to tolerate.
Panko (1999)
The software that end users are creating... is riddled with errors.
Burnett & Myers (2014)
Overconfidence is one of the most substantial causes of spreadsheet errors.
Sakal, et al (2015)
Spreadsheets are often hard, if not impossible, to understand.
Mireault & Gresham (2015)
Spreadsheets have a notoriously high number of faults.
Rust, et al (2006)
The untested spreadsheet is as dangerous and untrustworthy as an untested program.
Price (2006)
It is now widely accepted that errors in spreadsheets are both common and potentially dangerous.
Nixon & O'Hara (2010)
The issue is not whether there is an error but how many errors there are and how serious they are.
Panko (2007)
Spreadsheet development must embrace extensive testing in order to be taken seriously as a profession.
Bock (2016)
People tend to believe their spreadsheets are more accurate than they really are.
Caulkins, Morrison, & Weidemann (2006)
A significant proportion of spreadsheets have severe quality problems.
Ayalew (2007)
Spreadsheets are the most popular live programming environments, but they are also notoriously fault-prone.
Hermans & van der Storm (2015)
Spreadsheet shortcomings can significantly hamper an organization's business operation.
Reschenhofer & Matthes (2015)
A lot of decisions are being made on the basis of some bad numbers.
Ross (1996)
Spreadsheets can be viewed as a highly flexible programming environment for end users.
Abreu, et al (2015)
Your spreadsheets may be disasters in the making.
Caulkins, Morrison, & Weidemann (2006)
Programmers exhibit unwarranted confidence in the correctness of their spreadsheets.
Krishna, et al (2001)
Spreadsheet errors are pervasive, stubborn, ubiquitous and complex.
Irons (2003)
Most large spreadsheets have dozens or even hundreds of errors.
Panko & Ordway (2005)
Even obvious, elementary errors in very simple, clearly documented spreadsheets are... difficult to find.
Galletta, et al (1993)
94% of the 88 spreadsheets audited in 7 studies have contained errors.
Panko (2008)
Spreadsheets are easy to use and very hard to check.
Chen & Chan (2000)
Studies have shown that there is a high incidence of errors in spreadsheets.
Csernoch & Biro (2013)
Spreadsheet errors... a great, often unrecognised, risk to corporate decision making & financial integrity.
Chadwick (2002)
Spreadsheets are alarmingly error-prone to write.
Paine (2001)
Spreadsheets... pose a greater threat to your business than almost anything you can imagine.
Howard (2005)
Errors in spreadsheets... result in incorrect decisions being made and significant losses incurred.
Beaman, et al (2005)
The results given by spreadsheets are often just wrong.
Sajaniemi (1998)
1% of all formulas in operational spreadsheets are in error.
Powell, Baker, & Lawson (2009)
Every study that has looked for errors has found them... in considerable abundance.
Panko & Halverson (1996)

Spreadsheet bibliography

Title: Automated refactoring of nested-IF formulae in spreadsheets
Authors: Jie Zhang, Shi Han, Dan Hao, Lu Zhang, & Dongmei Zhang
Year: 2017
Type: Article
Publication: Preprint
Series: December
Abstract

Spreadsheets are the most popular end-user programming software, where formulae act like programs and also have smells.

One well recognized common smell of spreadsheet formulae is nest-IF expressions, which have low readability and high cognitive cost for users, and are error-prone during reuse or maintenance. However, end users usually lack essential programming language knowledge and skills to tackle or even realize the problem.

Previous research has made only preliminary attempts in this direction, and no effective automated approach is currently available.

This paper proposes, for the first time, an Abstract Syntax Tree (AST)-based automated approach to systematically refactoring nest-IF formulae. The general idea is two-fold. First, we detect and remove logic redundancy on the AST. Second, we identify higher-level semantics that have been fragmented and scattered, and reassemble the syntax using concise built-in functions.
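
The first step, removing logic redundancy on the AST, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `If` node type, the `simplify` and `render` helpers, and the choice to treat conditions as opaque text are all assumptions made for illustration. The sketch handles only two simple cases, a condition already decided by an enclosing IF and an IF whose two branches are identical, and it assumes conditions are free of volatile functions such as RAND().

```python
from dataclasses import dataclass
from typing import Dict, Optional, Union


@dataclass(frozen=True)
class If:
    """Hypothetical AST node for IF(cond, then, other)."""
    cond: str        # condition kept as opaque formula text, e.g. "A1>10"
    then: "Node"
    other: "Node"


# A leaf is plain formula text such as '"High"' or "B1".
Node = Union[If, str]


def simplify(node: Node, known: Optional[Dict[str, bool]] = None) -> Node:
    """Remove two simple kinds of logic redundancy from a nested-IF tree."""
    known = known or {}
    if isinstance(node, str):
        return node
    # A condition already decided by an enclosing IF makes this test redundant.
    if node.cond in known:
        branch = node.then if known[node.cond] else node.other
        return simplify(branch, known)
    then = simplify(node.then, {**known, node.cond: True})
    other = simplify(node.other, {**known, node.cond: False})
    # Identical branches make the test itself redundant.
    if then == other:
        return then
    return If(node.cond, then, other)


def render(node: Node) -> str:
    """Turn the tree back into formula text."""
    if isinstance(node, str):
        return node
    return f"IF({node.cond},{render(node.then)},{render(node.other)})"


# IF(A1>10, IF(A1>10,"High","Mid"), IF(B1=0,"Low","Low"))
tree = If("A1>10",
          If("A1>10", '"High"', '"Mid"'),
          If("B1=0", '"Low"', '"Low"'))
print(render(simplify(tree)))  # IF(A1>10,"High","Low")
```

Comparing conditions as text keeps the sketch short, but it misses equivalent conditions that are written differently (for example `A1>10` and `10<A1`); a fuller treatment would need to normalize or reason about the condition expressions themselves.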

A comprehensive evaluation has been conducted against a real-world spreadsheet corpus, collected at a leading IT company for research purposes. The results, covering over 68,000 spreadsheets with 27 million nest-IF formulae, reveal that our approach is able to relieve the smell of over 99% of nest-IF formulae. Over 50% of the refactorings have reduced the nesting levels of the nest-IFs by more than half.

In addition, a survey involving 49 participants indicates that in most cases the participants prefer the refactored formulae, and agree that such an automated refactoring approach is necessary and helpful.

Full version: Available
Sample
Circos chart of the overlap between patterns

We refactor nested-IF formulae using nine patterns and alternative functions:

  • Redundant. Logic redundancy is removed to simplify the nested-IF formula.
  • AND.
  • OR.
  • CHOOSE.
  • MATCH.
  • LOOKUP.
  • MAX/MIN.
  • IFS.
  • Useless. The IF function can be removed, as it is unnecessary.
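
As an illustration of one pattern from the list above, the following Python sketch flattens an else-nested IF chain into a single IFS call. The paper's actual matching rules are not given here, so the `If` node type and the `to_ifs` helper are hypothetical, and the sketch covers only the simplest shape: every "then" branch is a leaf and the nesting occurs only in the "else" branch. The final default value is expressed with a trailing `TRUE` condition, which is the usual way to write a catch-all case in IFS.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class If:
    """Hypothetical AST node for IF(cond, then, other)."""
    cond: str        # condition kept as opaque formula text
    then: "Node"
    other: "Node"


# A leaf is plain formula text such as '"A"' or "B1".
Node = Union[If, str]


def to_ifs(node: Node) -> str:
    """Flatten IF(c1,v1,IF(c2,v2,...,default)) into IFS(c1,v1,c2,v2,...,TRUE,default)."""
    pairs: List[Tuple[str, str]] = []
    # Walk down the else branches, collecting (condition, value) pairs.
    while isinstance(node, If) and isinstance(node.then, str):
        pairs.append((node.cond, node.then))
        node = node.other
    # Stop if the chain has a nested "then" branch or is not nested at all.
    if isinstance(node, If) or len(pairs) < 2:
        raise ValueError("not a pure else-nested IF chain")
    body = ",".join(f"{cond},{value}" for cond, value in pairs)
    return f"IFS({body},TRUE,{node})"


# IF(A1>=90,"A",IF(A1>=80,"B",IF(A1>=70,"C","F")))
chain = If("A1>=90", '"A"',
           If("A1>=80", '"B"',
              If("A1>=70", '"C"', '"F"')))
print(to_ifs(chain))  # IFS(A1>=90,"A",A1>=80,"B",A1>=70,"C",TRUE,"F")
```

The nesting depth drops from three levels to one while the order of the tests is preserved, so the first condition that holds still determines the result.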

This circos visualization shows the scale of refactored formulae for each pattern of nested-IF formulae (except MATCH, which does not overlap with any pattern).
