The speed of loops

Commenting on the Good Practice post, John Tolle pointed out that with a collection it was very much quicker to loop using a For each loop than iterating by index, using a for, next loop.  I did some simple tests on this, also looking at ranges and arrays, and found he was absolutely right.  The test consisted of looping through a collection, range or array with 600, 6000 or 60,000 items, 1000, 100, or 10 times, adding 1 to the value at each step, so there were 600,000 operations in all cases.  The results are summarised in the screen shot below:

Looping through ranges, arrays and collections

 The lessons from this exercise were:

  • Arrays were always faster than the equivalent operations with a collection, and very much faster than a range
  • Using a “For Each” loop on an array was 3 times faster than a “For, Next” loop.
  • Using a “For Each” loop on a collection was almost as fast as a “For Each” loop with an array.
  • Times for a “For, Next” loop with a collection were proportional to the square of the size of the collection.  For the largest collection (60,000 items) this form of loop was nearly 2000 times slower than the “For Each” loop.
  • Where possible use “For Each” loops, in preference to a “For, Next” loop.
  • If a “For, Next” loop is necessary, use an array in preference to a collection.
  • Avoid iterating through a range in any application where speed is important.

For those interested in the code, the spreadsheet can be downloaded from CheckLoopSpeed.xls

The macro checkloop is currently set up for 10 iterations of 60,000 cells. If you want to change the parameters you need to change the NumReps value and the UserRng extent (at two locations) in the code. Note that for the loops using a range I have used 1/10 the number of iterations, and multiplied the time by 10, so the NumReps value should be 1/10 the actual number you want.

On re-running I also found that the range times were about twice as fast as previously, but still very much slower than the array times, or the collection times with a for each loop.

This entry was posted in Arrays, Excel, VBA and tagged , , , , , . Bookmark the permalink.

8 Responses to The speed of loops

  1. JP says:

    I’m trying to chart your data but it the data for the For Next Collection for 60k rows looks off. Are you sure that number is correct?

    Like

  2. Michael says:

    Hi Doug –

    Are the units above seconds? and do the entries represent start to finish? What’s the relation of the second grouping of 3 to the first of 6? Titles look similar. Finally, what is represented on the slowest/fastest line?

    I know…too many questions.

    …mrt 😉

    Like

  3. dougaj4 says:

    Jimmy – yes the time for the For-Next is roughly proportional to the square of the number of items in the collection, so if the number of rows goes up by a factor of 10 the execution time per loop increases by a factor of 100.

    I’m going to post the spreadsheet; please let me know if you find any blunders, but I’m pretty sure the times are right.

    Michael – Yes the units are seconds for the first group of 6.

    The next group of 3 is the ratio of the time for a for-next loop divided by the time for a for-each loop, and the final line is the slowest time divided by the fastest time.

    Like

  4. Michael says:

    Hi Doug –

    So that’s “/” as in divides and not “/” as in “and/or”.

    Must be an Aussie thing. 😉

    …mrt (duh!)

    Like

  5. Sericanus says:

    Looking at your code, it appears that the timer for the collection part does not include the time it takes to build the collections variable. I.e., this section:

    For Each val In UserA
    UserCollctn.Add val
    Next val

    Since this would generally be a necessary step for most calculations involve collections, shouldn’t this be included in the calculation time of the two methods involving collections?

    Aside from that, a very instructive post – thank you!

    Like

  6. Charles says:

    Looking at the code, I don’t think you are comparing like with like: some of your loops involve reading and writing back but others don’t.
    If you convert them all to just a read the picture looks a bit different, although the for For/Next on a collection is still a disaster: my timings are
    1.457068865
    0.883316652
    0.048390529
    0.032332211
    72.67488084
    0.036354665

    Like

  7. dougaj4 says:

    Sericanus – The range and array loops got their data straight from the spreadsheet, so I thought it would be unfair to include the time for building the collection, but certainly the collection times would be slowed down a little if this was included.

    Charles – I had a look at the timings including a write back to the spreadsheet at the end of each inner loop, and I also switched off recalculation. This (for the 600 rows x 1000 iterations case) reduced the range times to about 25 seconds, and increased the array times to about 5 seconds. This indicates that it is the write part of the read/write cycle that really takes the time when interacting with the spreadsheet.

    For the collection the only way I could see to make it work was to write one item at a time, and that is still running (started about 5 minutes ago). Is their away to copy a collection to the spreadsheet in one operation? Other than writing it to an array first, which seems like cheating for this exercise.

    For your runs with the ranges, how did you increment the range values without writing back to the spreadsheet?

    Anyway I think the main lessons are clear:

    1. Avoid looping through a collection with a for, next loop.
    2. Write to the spreadsheet in large blocks of data wherever possible.
    3. If you do need to write small blocks of data multiple times, switch recalculation to manual before you do it.

    Like

  8. Pingback: Daily Download 33: Miscellaneous | Newton Excel Bach, not (just) an Excel Blog

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.