Detecting and predicting evolution in spreadsheets: A case study in an energy network company

Authors

Bas Jansen, Felienne Hermans, & Edwin Tazelaar

Abstract

The use of spreadsheets in industry is widespread, and the information they provide is often used to make decisions. Research has shown that spreadsheets are error-prone, which creates the risk that decisions are based on incorrect information.

Software evolution is a well-researched topic, and its results have been shown to help developers create better software. Could the same approach be applied to spreadsheets? Unfortunately, research on spreadsheet evolution is still limited. The aim of this paper is therefore to obtain a better understanding of how spreadsheets evolve over time and whether studying that evolution provides benefits for spreadsheets similar to those it provides for source code.

In this study, we cooperated with Alliander, a large energy network company in the Netherlands. We conducted two case studies on two different sets of spreadsheets, both of which had already been maintained for a period of three years. To better understand the spreadsheets themselves and the context in which they evolved, we also interviewed their creators.

We focus on the changes made to the formulas over time. Such changes alter the behavior of the spreadsheet and could introduce errors. To analyze them effectively, we developed an algorithm that detects and visualizes these changes.
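The abstract does not describe the algorithm itself, so the following is only a minimal sketch of the general idea of detecting formula changes between two versions. It assumes each version has already been reduced to a mapping from cell address to formula text. The function name `diff_formulas` and this data layout are hypothetical, and the naive cell-by-cell comparison is not the authors' method.

```python
from typing import Dict, List

# Assumed representation: cell address -> formula text,
# e.g. {"B2": "=SUM(A1:A10)"}. This layout is hypothetical.
Formulas = Dict[str, str]


def diff_formulas(old: Formulas, new: Formulas) -> Dict[str, List[str]]:
    """Compare two versions cell by cell and classify formula cells.

    Returns the addresses of formulas that were added, removed, or
    changed between the old and the new version.
    """
    added = sorted(cell for cell in new if cell not in old)
    removed = sorted(cell for cell in old if cell not in new)
    changed = sorted(
        cell for cell in new if cell in old and old[cell] != new[cell]
    )
    return {"added": added, "removed": removed, "changed": changed}


# Illustrative data, not taken from the case study.
v1 = {"B2": "=SUM(A1:A10)", "C2": "=B2*0.21"}
v2 = {"B2": "=SUM(A1:A12)", "C2": "=B2*0.21", "D2": "=B2+C2"}
result = diff_formulas(v1, v2)
```

A comparison by cell address treats a formula that merely moved (for example after a row was inserted) as one removal plus one addition, which is one reason a practical detection algorithm would need to be more elaborate than this sketch.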

Results indicate that studying the evolution of a spreadsheet helps to identify areas that are error-prone, likely to change, or that could benefit from refactoring. Furthermore, by analyzing how frequently formulas change from version to version, it is possible to predict which formulas will need to be changed when a new version of the spreadsheet is created.
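The frequency-based prediction mentioned above can be illustrated with a small sketch. It assumes a chronological list of versions, each a hypothetical mapping from cell address to formula text, and flags cells whose formula changed in at least a chosen fraction of version transitions. The function names and the `threshold` parameter are assumptions made for illustration, not details from the paper.

```python
from collections import Counter
from typing import Dict, List

Formulas = Dict[str, str]  # assumed layout: cell address -> formula text


def change_frequency(versions: List[Formulas]) -> Dict[str, float]:
    """Fraction of version-to-version transitions in which each cell's
    formula changed. Cells that never changed are omitted."""
    transitions = len(versions) - 1
    if transitions <= 0:
        return {}
    counts: Counter = Counter()
    for old, new in zip(versions, versions[1:]):
        for cell, formula in new.items():
            if cell in old and old[cell] != formula:
                counts[cell] += 1
    return {cell: n / transitions for cell, n in counts.items()}


def likely_to_change(versions: List[Formulas], threshold: float = 0.5) -> List[str]:
    """Cells whose change frequency meets the (hypothetical) threshold."""
    freq = change_frequency(versions)
    return sorted(cell for cell, f in freq.items() if f >= threshold)


# Illustrative history, not taken from the case study.
history = [
    {"B2": "=SUM(A1:A10)", "C2": "=B2*0.19"},
    {"B2": "=SUM(A1:A11)", "C2": "=B2*0.21"},
    {"B2": "=SUM(A1:A12)", "C2": "=B2*0.21"},
]
candidates = likely_to_change(history, threshold=0.75)
```

In this example `B2` changes in both transitions and `C2` in only one, so with a threshold of 0.75 only `B2` is flagged as a formula to review when the next version is created.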

Sample

Evolution of size of the spreadsheet over several versions

This figure shows how the size of a sample spreadsheet grew over time.

Both the number of formulas and the number of non-empty cells grew markedly.

Much of the growth was due to new functionality being added to the model, based on requests from end-users for more detailed information.

Publication

2018, IEEE International Conference on Software Maintenance and Evolution (ICSME), September

Full article

Detecting and predicting evolution in spreadsheets: A case study in an energy network company