VBA vs Power Query vs Python

A recent post at Chandoo’s Excel Forum asked for a formula to extract all pairs of digits from a number with 4 digits.  This was soon extended to dealing with longer numbers, and the examples looked at here will work with at least all pairs of digits from a 14 digit number.

Examples and code shown below can be downloaded from:

Listcomb2.zip

Hui came up with a VBA UDF, making use of a dictionary:

Function Extract_Pairs(str As Variant, Optional sort As String = "None") As Variant

' Declare
Dim dict As Object
Set dict = CreateObject("Scripting.Dictionary")

Dim i As Integer, j As Integer, k As Integer
 Const sep As String = ";"

'Loop through input string and add pairs to collection
For i = 1 To Len(str)-1
   For j = i + 1 To Len(str)
     ' Add item
    If Not (dict.Exists(Mid(str, i, 1) & Mid(str, j, 1))) Then
       dict.Add Mid(str, i, 1) & Mid(str, j, 1), 1
     End If
   Next j
Next i

If sort = "xlAscending" Then
   Set dict = SortDictionaryByKey(dict, xlAscending)
ElseIf sort = "xlDescending" Then
   Set dict = SortDictionaryByKey(dict, xlDescending)
End If

'Extract elements out of Collection into string
For Each key In dict.keys
   tempstr = tempstr & key & sep
Next

'Return string to function
Extract_Pairs = "{" & Left(tempstr, Len(tempstr) - 1) & "}"
End Function

See the download file (link above) for slightly modified version,  code for the SortDictionaryByKey function, and example output.

Lori came  up with a  lengthy on-sheet formula, and some examples of much shorter Python code (with a link to this blog):

>>> import itertools
 >>> list(set(itertools.combinations([1,2,3,4],2)))
 [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (2, 4)]
 >>> list(set(itertools.combinations([5,6,6,8],2)))
 [(5, 6), (6, 8), (5, 8), (6, 6)]
 >>> list(set(itertools.combinations([5,7,7,7],2)))
 [(5, 7), (7, 7)]
 >>> list(set(itertools.combinations([7,7,7,7],2)))
 [(7, 7)]

I have used the examples above to write a short Python function (in two versions) that can be called from Excel, via xlwings:

import xlwings as xlw
import itertools

@xlw.func
@xlw.arg('num',numbers = int)
@xlw.ret(expand='table')
def listcombs(vals, num=2):
    vals.sort() 
    rtn = list(set(itertools.combinations(vals, num)))
    return sorted(rtn)
    
@xlw.func
@xlw.arg('num',numbers = int)
def listcombs2(vals, num=2):
    vals.sort() 
    rtn = list(set(itertools.combinations(vals, num)))
    return sorted(rtn)

The first version uses the xlwings table decorator to adjust the size of the output array. This is currently slow, and sometimes does not fully update, so the second version uses my VBA array function re-size macro.

Finally Peter Bartholomew posted a solution using Power Query.  A spreadsheet with open code and documentation can be downloaded from:

challenge-n-digit PQ.xlsx

Examples of the results using the different approaches are shown in the screen shots below:

Hui’s original VBA code returned all different 2 digit numbers, for instance both 23 and 32.  I  have modified  it to return only the lower value when two values have the same two digits, to be consistent with Peter Bartholomew’s results:

Output from Peter Bartholomew’s Power Query spreadsheet is shown below.  See the download link for details:

The Python function requires the input digits in separate cells, and returns the results as a multi-column array:

The Python functions will work on an input range of any length, and extract groups of up to 9 digits.  The output array re-sizes automatically if any of the input data is changed:

 

This entry was posted in Arrays, Excel, Link to Python, UDFs, VBA, xlwings and tagged , , , , , , , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.