Authors

David Hoepelman

Abstract

Spreadsheets have a life-cycle similar to that of other software: they are inherited throughout an organization, are maintained by different users, and evolve over time to meet changing requirements. This leads to increased complexity and technical debt.

In software engineering, refactoring is used to combat these problems by improving software structure without altering behavior. This technique can also be applied to spreadsheets.

In this thesis we present an improved version of the spreadsheet refactoring tool BumbleBee, extended with six refactorings: Extract Formula, Inline Formula, Introduce Cell Name, Group References, Introduce Aggregate and Introduce Conditional Aggregate. The Inline Formula, Group References and Introduce Conditional Aggregate refactorings had not been implemented before, while Extract Formula and Introduce Cell Name improve upon previous implementations.

To support these refactorings and facilitate future spreadsheet research, the existing formula parser needed improvements. We implemented these improvements and released the result as the open-source software package XLParser, a stand-alone C# parser for spreadsheet formulas. XLParser was evaluated on more than a million unique formulas from industrial datasets and successfully parsed 99.999% of them.

Sample

Overview of the refactoring process

This is the standard way of implementing refactorings. The purpose of this thesis was to make a better parser for Excel formulas, as this would not only be very useful for implementing refactorings but would be beneficial to all future spreadsheet research projects.

Publication

2015, Master's thesis, Delft University of Technology

Full article

Tool-assisted spreadsheet refactoring and parsing spreadsheet formulas