Birgit Hofer & Franz Wotawa
Spreadsheets may be large, containing several thousand formulas, and thus they may be hard to comprehend and analyze. Unfortunately, they are also prone to errors.
Identifying the cells that are responsible for an observed error is time-consuming, tedious, and frustrating. Spectrum-based Fault Localization (SFL) helps users identify more quickly the cells that have to be modified to eliminate an observed misbehavior. SFL requires information about the correctness of certain cell values, and users might classify such cell values wrongly. A misclassification may influence the outcome of SFL substantially.
In this paper, we investigate the influence of incorrect user information on the quality of SFL. In particular, we present a theoretical analysis of the impact of a misclassification on the Ochiai similarity coefficient and an empirical evaluation based on 33 spreadsheets with 218 faulty versions.
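To make the role of the user's classification concrete, the following minimal sketch computes the Ochiai coefficient in its commonly used form, n_ef / sqrt((n_ef + n_nf) * (n_ef + n_ep)). It is an illustration under stated assumptions, not the implementation evaluated in the paper: it assumes that each classified output cell contributes one observation, that a cell is "involved" in an output if the output depends on it, and the names `ochiai`, `rank_cells`, `cones`, and `verdicts`, as well as the toy cell addresses, are hypothetical.

```python
from math import sqrt


def ochiai(n_ef: int, n_nf: int, n_ep: int) -> float:
    """Ochiai coefficient for one cell.

    n_ef: erroneous outputs that depend on the cell
    n_nf: erroneous outputs that do not depend on the cell
    n_ep: correct outputs that depend on the cell
    """
    denom = sqrt((n_ef + n_nf) * (n_ef + n_ep))
    return n_ef / denom if denom else 0.0


def rank_cells(cones: dict, verdicts: dict) -> dict:
    """Score every cell that appears in at least one cone.

    cones:    output cell -> set of cells it depends on (assumed representation)
    verdicts: output cell -> True if the user judged the value erroneous
    """
    cells = set().union(*cones.values())
    scores = {}
    for cell in cells:
        n_ef = sum(1 for o, cone in cones.items() if verdicts[o] and cell in cone)
        n_nf = sum(1 for o, cone in cones.items() if verdicts[o] and cell not in cone)
        n_ep = sum(1 for o, cone in cones.items() if not verdicts[o] and cell in cone)
        scores[cell] = ochiai(n_ef, n_nf, n_ep)
    return scores


# Hypothetical toy spreadsheet: three outputs and the cells they depend on.
cones = {"D1": {"B1", "C1"}, "D2": {"B2", "C1"}, "D3": {"B3"}}

# Classification in which D1 and D2 are judged erroneous.
correct_verdicts = {"D1": True, "D2": True, "D3": False}
scores_ok = rank_cells(cones, correct_verdicts)

# Same spreadsheet, but the user wrongly classifies D2 as correct.
wrong_verdicts = {"D1": True, "D2": False, "D3": False}
scores_bad = rank_cells(cones, wrong_verdicts)
```

In this toy setting, C1 is shared by both erroneous outputs and receives the highest score under the first classification. After the single flipped verdict for D2, B1 overtakes C1, which illustrates how one misclassification can reorder the ranking.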
Spreadsheet errors are surprisingly common, even for spreadsheets of a trivial size. Assuming a 3% to 5% error rate per classified cell value, there is a 26% to 40% chance of making at least one mistake when classifying 10 cell values.
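The 26% to 40% range can be reproduced with a short calculation, sketched here under the assumption that each classification is wrong independently and with the same probability. The function name `p_at_least_one_error` is hypothetical.

```python
def p_at_least_one_error(error_rate: float, n: int) -> float:
    """Probability of at least one mistake in n independent classifications."""
    return 1.0 - (1.0 - error_rate) ** n


# Lower and upper bound of the assumed per-classification error rate, 10 cells.
low = p_at_least_one_error(0.03, 10)
high = p_at_least_one_error(0.05, 10)
print(f"{low:.1%} to {high:.1%}")
```

Under this independence assumption, the result is the complement of classifying all 10 values correctly, that is, 1 - (1 - p)^10.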
In this example, input cells are shaded in light blue, correct output values are given in green font, and faulty cells are shaded in red.
2015, Software Quality, Reliability and Security, August, pages 282-291