|Title||Understanding data analysis workflows on spreadsheets: Roadblocks and opportunities|
|Authors||Pingjing Yang, Cheng Ti-Chung, Sajjadur Rahman, Mangesh Bendre, Karrie Karahalios, & Aditya Parameswaran|
|Publication||Proceedings of Workshop on Human-In-the-Loop Data Analytics (HILDA'20)|
Spreadsheets are widely used for data management and analysis by individuals and teams with varying degrees of programming expertise across a spectrum of domains.
While several papers have studied the prevalence of errors in spreadsheets and reported ethnographic studies of spreadsheet use, little is known about how spreadsheet users approach and address computational tasks on spreadsheets, especially on relatively large datasets.
To understand how users analyze data on spreadsheets, we conducted a study consisting of eight common analytical tasks, with thirty-two participants. Participants developed an execution strategy for each task and then attempted to operationalize this strategy within the spreadsheet system. From examining the study results and transcripts, we identified the successful and unsuccessful strategies participants adopted in addressing the tasks.
In general, we find that unsuccessful spreadsheet users had difficulties mapping spreadsheet models to their predetermined execution strategies, comprehending online help documents when trying to learn how to use new formulae, and identifying workarounds when confronted with roadblocks.
We identify opportunities to reduce barriers in computational task completion, including improvements to the spreadsheet interface and better training/educational methodologies and tools.
The figure shows a Sankey diagram summarizing how participants attempted a task.
Out of 24 participants, 6 gave an incorrect answer after carrying out their planned approach. Of these six, one switched to a different approach and reached the correct result, while five gave up.
We identified three typical flows for participants when attempting to address tasks: