i-nth - On the effectiveness of automatically inferred invariants in detecting regression faults in spreadsheets

Authors

Sohon Roy, Arie van Deursen, & Felienne Hermans

Abstract

Automatically inferred invariants have been found to be successful in detecting regression faults in traditional software, but their application has not been explored in the context of spreadsheets.

In this paper, we investigate the effectiveness of automatically inferred invariants in detecting regression faults in spreadsheets.

We conduct an exploratory empirical study on eight spreadsheets taken from the VEnron and EUSES corpora. We apply automatic invariant inference to them, create tests based on the inferred invariants, and finally seed the sheets with faults.
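The three steps (infer invariants, turn them into tests, seed faults) can be illustrated with a minimal sketch. The abstract does not say which inference tool or which kinds of invariants were used, so everything here is an assumption made for illustration: the toy spreadsheet model (a dict of cell names to Python functions), the `RangeInvariant` class, and the helper names `evaluate`, `infer_invariants`, and `detects_fault` are all hypothetical and are not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeInvariant:
    """Hypothetical invariant: a formula cell's value stays within [low, high]."""
    cell: str
    low: float
    high: float

    def holds(self, sheet):
        return self.low <= sheet[self.cell] <= self.high


def evaluate(formulas, inputs):
    """Evaluate a toy spreadsheet: each formula is a function of the cells so far."""
    sheet = dict(inputs)
    for cell, formula in formulas.items():
        sheet[cell] = formula(sheet)
    return sheet


def infer_invariants(formulas, input_samples):
    """Step 1: infer one range invariant per formula cell from observed values."""
    observed = {cell: [] for cell in formulas}
    for inputs in input_samples:
        sheet = evaluate(formulas, inputs)
        for cell in formulas:
            observed[cell].append(sheet[cell])
    return [RangeInvariant(c, min(v), max(v)) for c, v in observed.items()]


def detects_fault(invariants, faulty_formulas, input_samples):
    """Step 2: use the invariants as tests; True if any invariant is violated."""
    for inputs in input_samples:
        sheet = evaluate(faulty_formulas, inputs)
        if any(not inv.holds(sheet) for inv in invariants):
            return True
    return False


# Original sheet: C1 = A1 + B1, observed on two made-up input rows.
formulas = {"C1": lambda s: s["A1"] + s["B1"]}
samples = [{"A1": 1, "B1": 2}, {"A1": 3, "B1": 4}]
invariants = infer_invariants(formulas, samples)

# Step 3: seed faults (assumed fault types, for illustration only).
operator_fault = {"C1": lambda s: s["A1"] * s["B1"]}  # '+' replaced by '*'
operand_fault = {"C1": lambda s: s["B1"] + 1}         # A1 replaced by constant 1

print(detects_fault(invariants, operator_fault, samples))  # True
print(detects_fault(invariants, operand_fault, samples))   # False
```

In this toy case the operator fault pushes `C1` outside the inferred range and is caught, while the operand fault stays inside it and goes undetected, which is one simple way detection can depend on both the formula and the fault type.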

Results indicate that the effectiveness of the inferred invariants, measured as the accuracy of fault detection, varies widely from spreadsheet to spreadsheet. Effectiveness is affected by the formulas and data contained in the spreadsheets, and also by the types of faults to be detected.

Sample

Recall (%) of fault detection according to fault types

This figure shows the percentage recall for each fault type.

More generally, we examine two questions:

How accurate are inferred invariants in detecting regression faults in spreadsheets?
What factors affect the accuracy of inferred invariants in detecting regression faults in spreadsheets?

Results show that accuracy, measured as the recall rate of fault detection, varies extremely from case to case. It depends on the types of formulas and data in the spreadsheet, and on the types of faults.
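Recall per fault type is the share of seeded faults of that type that the invariant-based tests detect. A minimal sketch of the calculation follows; the function name `recall_by_fault_type`, the fault type labels, and the outcome data are hypothetical placeholders, not results from the paper.

```python
from collections import defaultdict


def recall_by_fault_type(results):
    """Return recall (%) per fault type.

    results: iterable of (fault_type, detected) pairs, one per seeded fault.
    """
    seeded = defaultdict(int)
    detected = defaultdict(int)
    for fault_type, was_detected in results:
        seeded[fault_type] += 1
        if was_detected:
            detected[fault_type] += 1
    return {ft: 100.0 * detected[ft] / seeded[ft] for ft in seeded}


# Made-up outcomes for illustration only (not data from the study).
example_results = [
    ("operator replacement", True),
    ("operator replacement", False),
    ("reference change", True),
]
print(recall_by_fault_type(example_results))
# {'operator replacement': 50.0, 'reference change': 100.0}
```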

Publication

2018, 18th IEEE International Conference on Software Quality, Reliability, and Security, July

Full article

On the effectiveness of automatically inferred invariants in detecting regression faults in spreadsheets