Data Transfer to Python – Update

As noted in the comments here, the main bottleneck in the Python matrix solver functions presented recently was not in the data transfer from Excel, but rather in the creation of the Numpy arrays for very long lists of short lists (see this Stackoverflow thread for more details of the background).  It seems there is a substantial time saving in converting the long array into 1D vectors, which can be converted into Numpy arrays very much more quickly.  The VBA code below converts a 3D array of any length (up to the maximum allowed in Excel) to 3 vectors.

Function Vectorize(x As Variant, x_1 As Variant, x_2 As Variant, x_3 As Variant) As Long
Dim nr As Long

nr = x.Rows.Count
x_1 = x.Resize(nr, 1).Value2
x_2 = x.Offset(0, 1).Resize(nr, 1).Value2
x_3 = x.Offset(0, 2).Resize(nr, 1).Value2
Vectorize = nr
End Function

To transfer these vectors to Python, via ExcelPython, the PyObj function must be used:

Set x = Range("SSA")  ' Excel range with 500500 rows and 3 columns
n = Vectorize(x, x_1, x_2, x_3) 'Convert range values to 3 vectors
' Create ExcelPython objects for transfer to Python
Set x_1 = PyObj(x_1, 1)
Set x_2 = PyObj(x_2, 1)
Set x_3 = PyObj(x_3, 1)

In Python the three vectors are converted to Numpy arrays:

def xl_getnpvect(x_1, x_2, x_3):
    timea = np.zeros(4)
    timea[0] = time.clock()
    x_1 = np.array(x_1)
    timea[1] = time.clock()
    x_2 = np.array(x_2)
    timea[2] = time.clock()
    x_3 = np.array(x_3)
    timea[3] = time.clock()
    return timea.tolist()

The table below compares the data transfer and conversion times using this method on an Excel range of 500500 rows x 3 columns, with the same operation using a 2D variant array.

XLpyMatrix3-0

The times for solution of a large sparse matrix system (10945×10945 matrix), using the new vector transfer routine, are shown below:
XLpyMatrix3-1

The data transfer and array creation times are now a relatively small proportion of the total solution time, even for the iterative solver with a solve time of only 0.28 seconds.

This entry was posted in Arrays, Excel, Frame Analysis, Link to Python, NumPy and SciPy, VBA and tagged , , , , , . Bookmark the permalink.

4 Responses to Data Transfer to Python – Update

  1. Your code looks good. Thanks for sharing.

    Like

  2. maurizio says:

    maybe it might be of help to force vba to pass data a ONE-dimensional array (as a list), and then
    make the reshape to right rows/columns in numpy (maybe also specifying the type and the
    column order).
    With a tweak, it is possible to do it WITHOUT copying the range, but simply forcing the array to think
    (temporarily) that it is one-dimensional. The following code does just this, and then restores the
    array:

    Option Explicit
    Option Base 1

    Declare Sub CopyMemory Lib “kernel32” Alias “RtlMoveMemory” (pvDest As Any, lpvSource As Any, ByVal cbCopy As Long)
    Declare Function VarPtrArray Lib “msvbvm60.dll” Alias “VarPtr” (Ptr() As Any) As Long

    Private Const VT_BYREF = &H4000&

    Public Type SAFEARRAYBOUND
    cElements As Long
    lLbound As Long
    End Type

    Public Type SAFEARRAY
    cDims As Integer
    fFeatures As Integer
    cbElements As Long
    cLocks As Long
    pvData As Long
    rgsabound(3) As SAFEARRAYBOUND
    End Type

    Function Rank(A) As Long
    Dim lp As Long, VType As Integer, sa As SAFEARRAY, n As Long
    Rank = 0
    If Not IsArray(A) Then Exit Function
    With sa
    CopyMemory VType, A, 2
    CopyMemory lp, ByVal VarPtr(A) + 8, 4
    If (VType And VT_BYREF) 0 Then
    CopyMemory lp, ByVal lp, 4
    End If
    CopyMemory .cDims, ByVal lp, 2
    Rank = .cDims
    End With
    End Function

    Function Make1Dim(A, m As Long)
    Dim lp As Long, VType As Integer, sa As SAFEARRAY, n As Long
    If Not IsArray(A) Then Exit Function
    With sa
    CopyMemory VType, A, 2
    CopyMemory lp, ByVal VarPtr(A) + 8, 4
    If (VType And VT_BYREF) 0 Then
    CopyMemory lp, ByVal lp, 4
    End If
    CopyMemory .cDims, ByVal lp, 16
    CopyMemory .rgsabound(1), ByVal lp + 16, 2 * Len(.rgsabound(1))
    m = .rgsabound(2).cElements
    n = .rgsabound(1).cElements
    .rgsabound(1).cElements = m * n
    .rgsabound(2).cElements = m * n
    .cDims = 1
    CopyMemory ByVal lp + 16, .rgsabound(1), 2 * Len(.rgsabound(1))
    CopyMemory ByVal lp, .cDims, 16
    End With
    Make1Dim = A
    End Function

    Sub reset2Dim(A, m As Long)
    Dim lp As Long, n As Long, VType As Integer, sa As SAFEARRAY
    If Not IsArray(A) Then Exit Sub
    With sa
    CopyMemory VType, A, 2
    CopyMemory lp, ByVal VarPtr(A) + 8, 4
    If (VType And VT_BYREF) 0 Then
    CopyMemory lp, ByVal lp, 4
    End If
    CopyMemory .cDims, ByVal lp, 16
    CopyMemory .rgsabound(1), ByVal lp + 16, 2 * Len(.rgsabound(1))
    .rgsabound(2).cElements = m
    .rgsabound(1).cElements = .rgsabound(1).cElements \ m
    .cDims = 2
    CopyMemory ByVal lp + 16, .rgsabound(1), 2 * Len(.rgsabound(1))
    CopyMemory ByVal lp, .cDims, 16
    End With
    End Sub

    Sub sho(r, A)
    Dim n As Integer
    n = Rank(A)
    Select Case n
    Case 0
    r = A
    Case 1
    r.Resize(1, UBound(A, 1)).Value2 = A
    Case 2
    r.Resize(UBound(A, 1), UBound(A, 2)).Value2 = A
    End Select
    End Sub

    Sub Test()
    Dim A, m As Long ‘ m saves the right no. of rows
    A = [B3:D6]
    sho [B10], Make1Dim(A, m)
    reset2Dim A, m
    sho [B13], A
    End Sub

    just put numbers in the range B3:D6, and run Test.

    Like

    • dougaj4 says:

      Hi Maurizio, thanks for the suggestion and the code.
      I had some problems with the code:
      After pasting into the VBE, and replacing the WordPress ” and ‘ with the standard versions, I still had three lines marked as errors:
      If (VType And VT_BYREF) 0 Then

      I have deleted the 0, and the code now seems to work. Does that make sense?

      I’m still not sure that I see the advantage of this approach though. With my code I get three 1D lists transferred to Python, which is quite convenient. I guess with a large rectangular array your way would be better though. Are there other advantages that I’m missing?

      Like

      • maurizio says:

        the original line had “If (VType And VT_BYREF) 0 Then” (i.e., not equal to zero), so your
        change is equivalent. In reality, I think that in the present context it’s even safe to delete those
        three lines. Regarding your question:
        1) as you guess, for a large dense matrix definitively there should be a speed advantage. In python it is quite easy to restore the dense matrix, something like (on the premise that i’m NOT a python expert)
        x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
        A=np.reshape(asfortranarray(x,dtype=’f8′),(3,-1),order=’F’)
        (it makes a 3 x 2 matrix).
        2) Similarly, if you need three vectors, i think you can get them in the python side, and possibly the whole process might be faster than the three calls;
        3) This approach can give some speed advantage in any vba code, in cases where it is appropriate to process a matrix in vba with a single loop and a single index (add a scalar to the whole matrix, etc.) Something that it was already possible in the old vb6, just blanking the check on the array indexes and using just one index, but that is impossible in vba.

        Is it worth? difficult to answer, maybe it is an option to have just in case.

        A final comment on your interface to ExcelPython: it is perfect for spreadsheets formulas, as it works on the sequence
        range -> variant -> py object -> python processing -> py object -> variant
        but this is overkill for vba programmnig, where i’d like to have also internal routines like
        py object -> python processing -> py object
        in vba, just use
        dim A
        set A = ….. (py call)
        and then the object A is alive and it can be used as such in other vba/ExcelPython processing. I think this “double” interface might increase the appeal of the whole thing.
        Best regards, maurizio

        Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.