Authors

Kelly Mack, John Lee, Kevin Chang, Karrie Karahalios, & Aditya Parameswaran

Abstract

In traditional usability studies, researchers talk to users of tools to understand their needs and challenges. Insights gained via such interviews offer context, detail, and background.

Because such interviews are costly in time and money, a new form of tool interrogation is emerging that prioritizes scale, cost, and breadth by using existing data from online forums. In this case study, we apply this method to a specific issue: the challenges that users face with Excel spreadsheets.

Spreadsheets are a versatile and powerful processing tool if used properly. However, with versatility and power come errors, from both users and the software, which make using spreadsheets less effective.

By scraping posts from Reddit, we collected a dataset of questions and complaints about Excel. We explored and characterized the issues users faced with spreadsheet software in general and, in particular, the issues that arise from large amounts of data in their spreadsheets.

We discuss the implications of our findings on the design of next-generation spreadsheet software.

Sample

Themes in the Reddit questions

The majority of the Reddit questions dealt with four themes relating to scalability problems: Importing Data, Managing Data, Querying Data, and Presenting Data. The remainder were placed in a Miscellaneous theme.

Every scalability post included a question, and most received more than one suggested solution from other Reddit users.

The following solutions were common across the four themes:

  • Use a database.
  • Turn off automatic calculations.
  • Save Excel files as .XLSB files.
  • Use as little conditional formatting as possible.
  • Avoid using volatile functions.
  • Avoid using computationally intensive functions.
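
As a rough illustration of the first suggestion ("Use a database"), the sketch below loads rows exported from a spreadsheet into SQLite and runs an aggregate query there instead of in spreadsheet formulas. It is not taken from the study or from any Reddit post: the `sales` table, its `region` and `amount` columns, the sample data, and the helper names are all hypothetical, and SQLite (via Python's standard `sqlite3` module) stands in for whatever database a user might choose.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for a large spreadsheet export.
SAMPLE_CSV = """region,amount
North,120.5
South,80.0
North,99.5
East,42.0
"""


def load_csv_into_sqlite(csv_text, conn):
    """Copy CSV rows into a SQLite table named 'sales' (hypothetical schema)."""
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(r["region"], float(r["amount"])) for r in reader]
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


def total_by_region(conn):
    """Sum amounts per region inside the database, sorted by region name."""
    cur = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
    return cur.fetchall()


# An in-memory database keeps the sketch self-contained; a real workflow
# would more likely point at a file on disk.
conn = sqlite3.connect(":memory:")
row_count = load_csv_into_sqlite(SAMPLE_CSV, conn)
totals = total_by_region(conn)
print(row_count, totals)
```

With real data, the CSV text would come from a file exported from the spreadsheet, and the query results could be pasted back into a much smaller sheet for presentation.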

Conclusions:

  • Users understand the capabilities of Excel, but not how to operationalize them.
  • Scalability issues affect a wide range of operations in spreadsheets.

Publication

2018, International Conference on Human Factors in Computing Systems (CHI), April

Full article

Characterizing scalability issues in spreadsheet software using online forums