An important aspect of a spreadsheet's fitness-for-purpose is ensuring that its run time is acceptable to the users. Creating an inefficient spreadsheet is easy, while creating an efficient spreadsheet requires a bit more thought and the application of some techniques to reduce run time.
This article presents an example of using refactoring techniques to vastly improve the run time of a spreadsheet. We focus on an especially slow feature of Excel: the interaction between VBA and the worksheets.
The example – which is available for download below – does a simple, but very repetitive, task using VBA.
The task is to:
- Start with a block of 100,000 random integers between 1 and 100 inclusive.
- Set an output block of 100,000 values equal to zero.
- Add one to each input value.
- Write the results to the output block.
- Use each output block cell in a subsequent calculation.
VBA methods to complete the task
To illustrate how different approaches to the task can lead to large differences in performance, the example spreadsheet uses two VBA methods:
- Method 1: A naive, but very common, approach to working with worksheet data – with VBA selecting and changing each cell individually, which then triggers a workbook recalculation at each step. This method is very inefficient. Using the
Application.ScreenUpdatingproperty would reduce the run time somewhat, but the method would still take a long time to complete the task.
- Method 2: A more sophisticated approach designed for maximum efficiency. It treats the data as an array, rather than as individual cells, and does the calculation in VBA using variables that have been properly declared.
Different methods result in an enormous difference in run time
The difference in run time between the two methods is enormous: on our testing PC, Method 1 takes 20 minutes to complete the task, while Method 2 takes a quarter of a second. That is, Method 2 is more than five thousand times faster than Method 1!
Depending on your computer environment your run times may differ from ours, but nonetheless we would expect a large difference in run times between the two methods irrespective of the computer you're using.
With just a few changes to the method, this example shows that it is possible to greatly improve the run time of your VBA code. The following sections show the code for the two methods.
Details of Method 1
The VBA code for Method 1 is shown below. It is called from another procedure, which handles the user interaction and calculation of the run time.
This code is a very inefficient approach to implementing the task. It selects each cell individually, switching back-and-forth between VBA and the worksheets, which is very slow. It also doesn't declare variables, which implicitly means that they are all of type
Variant (which is slower than explicitly declared variables).
Private Sub Method1() ' Typical method for interaction between VBA and worksheets. ' This method is very inefficient, as it fails to explicitly declare variables and ' then it loops over the input data by selecting each cell individually. ' Each step triggers a recalculation of the worksheets, which greatly increases the ' run time of the procedure. Worksheets("Input").Select NumRows = Range("Data").Rows.Count ' Adapt to the size of the input block NumCols = Range("Data").Columns.Count Worksheets("Output").Select For i = 1 To NumRows ' Loop over each cell in the output block For j = 1 To NumCols ' and set their values to zero Range("OutputTopLeft").Offset(i - 1, j - 1).Value = 0 Next j Next i For i = 1 To NumRows ' Loop over each cell in the input block, For j = 1 To NumCols ' perform the required function, and write Worksheets("Input").Select ' the results to the output block CurrCell = Range("DataTopLeft").Offset(i - 1, j - 1).Value NewVal = CurrCell + 1 Worksheets("Output").Select Range("OutputTopLeft").Offset(i - 1, j - 1).Value = NewVal Next j Next i Worksheets("Input").Select End Sub
Details of Method 2
The VBA code for Method 2 is shown below.
This code is much more efficient than Method 1. It works with the input data as an array, avoiding almost all the worksheet interaction. In addition to being more efficient, the code is also shorter and simpler.
Private Sub Method2() ' More efficient method for large scale interaction between VBA and worksheets. ' This method treats the data as an array, which is many times faster than working ' with cells individually. The calculation is also done in VBA, which avoids ' worksheet recalculations. Dim DataArray As Variant ' Local copy of input (must be Variant) Dim NumRows As Integer ' Number of rows in the data Dim NumCols As Integer ' Number of columns in the data Dim CurrRow As Integer ' Loop counter for rows Dim CurrCol As Integer ' Loop counter for columns DataArray = Range("Data") ' Get the data in a single step NumRows = UBound(DataArray, 1) NumCols = UBound(DataArray, 2) Range("Output").Value = 0 ' Set all cells in the output block to zero For CurrRow = 1 To NumRows ' Loop over each cell, perform the required For CurrCol = 1 To NumCols ' function, and write to the output block DataArray(CurrRow, CurrCol) = DataArray(CurrRow, CurrCol) + 1 Next CurrCol Next CurrRow Range("Output") = DataArray End Sub
Many techniques can be used to reduce the run time of a spreadsheet. The techniques that are appropriate in each case will depend on the characteristics of the spreadsheet. This example illustrates just a few of the techniques, so download the example and have a look at the code.
To find out more about how the reduce the run time of your spreadsheets, please contact us.
Download the example to see more about how it works
When you open the spreadsheet, you will need to enable macros before you can run the example. You may also need to make the file a trusted document.