Arrays vs Collections vs Dictionary Objects (and Dictionary help)

A recent comment by Lori Miller at Daily Dose of Excel suggested that (for the particular application being discussed) the VBA Collection object was quicker than using scripting dictionaries.  Since I hardly ever use collections (other than the built-in ones), and recently I have made quite extensive use of dictionaries I thought I would investigate further. In the process I found a comprehensive article on the scripting dictionary, which I recommend to Microsoft to see how to write truly helpful help articles:

Using the Dictionary Class in VBA

Commenting on the differences between dictionaries and collections, this article says:

For relatively simple needs, such as identifying only the distinct items in a list, there is no advantage to using a Dictionary from a feature functionality perspective.  However, if you must:

  • Retrieve keys as well as the items associated with those keys;
  • Handle case-sensitive keys; and/or
  • Be able to accommodate changes in items and/or keys

then using a Dictionary object offers a compelling alternative to a Collection.
However, even for relatively simple needs, a Dictionary may offer a significant performance advantage, as suggested by the analysis at the end of this article.

So which is it? Are collections faster than dictionaries, or are dictionaries faster than collections?

The linked article finishes up with a downloadable benchmark spreadsheet addressing this very question, giving the following results:


suggesting a substantial speed advantage for dictionaries. But that example was using a data set of 10000 rows x 50 columns.  Reducing the number of rows to 10 (and increasing iterations from 20 to 2000) I get:

Collection vs Dictionary (small data set)

Collection vs Dictionary (small data set)

In this case the collection is more than twice as fast as the dictionary, so it seems that it all depends on the size of the data (but for a small data set it probably doesn’t matter anyway, unless you have a huge number of repetitions).

As for arrays, they don’t have the lookup speed of collections and dictionaries, but they do have a number of advantages:

  • They are easy to create in VBA with a single statement:
    (Arrayname = Range(“rangename”).Value2)
  • 2D arrays are in the format required for matrix arithmetic, and are convenient for other arithmetical operations.
  • They are easily transferable to other programming languages.

Update 3rd September 2013

Two links provided by Lori in the comments are well worth a look, including more detailed benchmarking results.

Charles Williams: VBA UDF shootout between Linear Search, Binary Search, Collection and Dictionary

Colin Legg: Count Distinct Or Unique Values – VBA UDF

I think the main conclusions are:

  • Overall the scripting dictionary has the best balance of speed and convenience.
  • For very large data sets collections can have a speed advantage
  • For the best performance with large data sets consider using a C++ xll function.
This entry was posted in Arrays, Excel, VBA and tagged , , , , . Bookmark the permalink.

7 Responses to Arrays vs Collections vs Dictionary Objects (and Dictionary help)

  1. Lori says:

    A couple of fairly recent MVP blog posts draw similar conclusions. Charles Williams analysis shows that dictionaries are quicker for most reasonable size data sets but collections are faster for small or very large datasets (500k rows). Colin Legg points out that collections use a case insensitive key and also that results depend on the uniqueness of data.

    Like

  2. David Hager says:

    I think you mean “For very SMALL data sets collections can have a speed advantage”.

    Like

  3. Pingback: Dictionary link | Newton Excel Bach, not (just) an Excel Blog

  4. Pingback: Fix Dictnry.hlp Errors - Windows XP, Vista, 7 & 8

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.