## Data Transfer to Python – Update

As noted in the comments here, the main bottleneck in the Python matrix solver functions presented recently was not in the data transfer from Excel, but rather in the creation of the Numpy arrays for very long lists of short lists (see this Stackoverflow thread for more details of the background).  It seems there is a substantial time saving in converting the long array into 1D vectors, which can be converted into Numpy arrays very much more quickly.  The VBA code below converts a 3D array of any length (up to the maximum allowed in Excel) to 3 vectors.

```Function Vectorize(x As Variant, x_1 As Variant, x_2 As Variant, x_3 As Variant) As Long
Dim nr As Long

nr = x.Rows.Count
x_1 = x.Resize(nr, 1).Value2
x_2 = x.Offset(0, 1).Resize(nr, 1).Value2
x_3 = x.Offset(0, 2).Resize(nr, 1).Value2
Vectorize = nr
End Function
```

To transfer these vectors to Python, via ExcelPython, the PyObj function must be used:

```Set x = Range("SSA")  ' Excel range with 500500 rows and 3 columns
n = Vectorize(x, x_1, x_2, x_3) 'Convert range values to 3 vectors
' Create ExcelPython objects for transfer to Python
Set x_1 = PyObj(x_1, 1)
Set x_2 = PyObj(x_2, 1)
Set x_3 = PyObj(x_3, 1)
```

In Python the three vectors are converted to Numpy arrays:

```def xl_getnpvect(x_1, x_2, x_3):
timea = np.zeros(4)
timea = time.clock()
x_1 = np.array(x_1)
timea = time.clock()
x_2 = np.array(x_2)
timea = time.clock()
x_3 = np.array(x_3)
timea = time.clock()
return timea.tolist()
```

The table below compares the data transfer and conversion times using this method on an Excel range of 500500 rows x 3 columns, with the same operation using a 2D variant array. The times for solution of a large sparse matrix system (10945×10945 matrix), using the new vector transfer routine, are shown below: The data transfer and array creation times are now a relatively small proportion of the total solution time, even for the iterative solver with a solve time of only 0.28 seconds.

This entry was posted in Arrays, Excel, Frame Analysis, Link to Python, NumPy and SciPy, VBA and tagged , , , , , . Bookmark the permalink.

### 4 Responses to Data Transfer to Python – Update

1. online excel training says:

Your code looks good. Thanks for sharing.

Like

2. maurizio says:

maybe it might be of help to force vba to pass data a ONE-dimensional array (as a list), and then
make the reshape to right rows/columns in numpy (maybe also specifying the type and the
column order).
With a tweak, it is possible to do it WITHOUT copying the range, but simply forcing the array to think
(temporarily) that it is one-dimensional. The following code does just this, and then restores the
array:

Option Explicit
Option Base 1

Declare Sub CopyMemory Lib “kernel32” Alias “RtlMoveMemory” (pvDest As Any, lpvSource As Any, ByVal cbCopy As Long)
Declare Function VarPtrArray Lib “msvbvm60.dll” Alias “VarPtr” (Ptr() As Any) As Long

Private Const VT_BYREF = &H4000&

Public Type SAFEARRAYBOUND
cElements As Long
lLbound As Long
End Type

Public Type SAFEARRAY
cDims As Integer
fFeatures As Integer
cbElements As Long
cLocks As Long
pvData As Long
rgsabound(3) As SAFEARRAYBOUND
End Type

Function Rank(A) As Long
Dim lp As Long, VType As Integer, sa As SAFEARRAY, n As Long
Rank = 0
If Not IsArray(A) Then Exit Function
With sa
CopyMemory VType, A, 2
CopyMemory lp, ByVal VarPtr(A) + 8, 4
If (VType And VT_BYREF) 0 Then
CopyMemory lp, ByVal lp, 4
End If
CopyMemory .cDims, ByVal lp, 2
Rank = .cDims
End With
End Function

Function Make1Dim(A, m As Long)
Dim lp As Long, VType As Integer, sa As SAFEARRAY, n As Long
If Not IsArray(A) Then Exit Function
With sa
CopyMemory VType, A, 2
CopyMemory lp, ByVal VarPtr(A) + 8, 4
If (VType And VT_BYREF) 0 Then
CopyMemory lp, ByVal lp, 4
End If
CopyMemory .cDims, ByVal lp, 16
CopyMemory .rgsabound(1), ByVal lp + 16, 2 * Len(.rgsabound(1))
m = .rgsabound(2).cElements
n = .rgsabound(1).cElements
.rgsabound(1).cElements = m * n
.rgsabound(2).cElements = m * n
.cDims = 1
CopyMemory ByVal lp + 16, .rgsabound(1), 2 * Len(.rgsabound(1))
CopyMemory ByVal lp, .cDims, 16
End With
Make1Dim = A
End Function

Sub reset2Dim(A, m As Long)
Dim lp As Long, n As Long, VType As Integer, sa As SAFEARRAY
If Not IsArray(A) Then Exit Sub
With sa
CopyMemory VType, A, 2
CopyMemory lp, ByVal VarPtr(A) + 8, 4
If (VType And VT_BYREF) 0 Then
CopyMemory lp, ByVal lp, 4
End If
CopyMemory .cDims, ByVal lp, 16
CopyMemory .rgsabound(1), ByVal lp + 16, 2 * Len(.rgsabound(1))
.rgsabound(2).cElements = m
.rgsabound(1).cElements = .rgsabound(1).cElements \ m
.cDims = 2
CopyMemory ByVal lp + 16, .rgsabound(1), 2 * Len(.rgsabound(1))
CopyMemory ByVal lp, .cDims, 16
End With
End Sub

Sub sho(r, A)
Dim n As Integer
n = Rank(A)
Select Case n
Case 0
r = A
Case 1
r.Resize(1, UBound(A, 1)).Value2 = A
Case 2
r.Resize(UBound(A, 1), UBound(A, 2)).Value2 = A
End Select
End Sub

Sub Test()
Dim A, m As Long ‘ m saves the right no. of rows
A = [B3:D6]
sho [B10], Make1Dim(A, m)
reset2Dim A, m
sho [B13], A
End Sub

just put numbers in the range B3:D6, and run Test.

Like

• dougaj4 says:

Hi Maurizio, thanks for the suggestion and the code.
I had some problems with the code:
After pasting into the VBE, and replacing the WordPress ” and ‘ with the standard versions, I still had three lines marked as errors:
If (VType And VT_BYREF) 0 Then

I have deleted the 0, and the code now seems to work. Does that make sense?

I’m still not sure that I see the advantage of this approach though. With my code I get three 1D lists transferred to Python, which is quite convenient. I guess with a large rectangular array your way would be better though. Are there other advantages that I’m missing?

Like

• maurizio says:

the original line had “If (VType And VT_BYREF) 0 Then” (i.e., not equal to zero), so your
change is equivalent. In reality, I think that in the present context it’s even safe to delete those
1) as you guess, for a large dense matrix definitively there should be a speed advantage. In python it is quite easy to restore the dense matrix, something like (on the premise that i’m NOT a python expert)
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
A=np.reshape(asfortranarray(x,dtype=’f8′),(3,-1),order=’F’)
(it makes a 3 x 2 matrix).
2) Similarly, if you need three vectors, i think you can get them in the python side, and possibly the whole process might be faster than the three calls;
3) This approach can give some speed advantage in any vba code, in cases where it is appropriate to process a matrix in vba with a single loop and a single index (add a scalar to the whole matrix, etc.) Something that it was already possible in the old vb6, just blanking the check on the array indexes and using just one index, but that is impossible in vba.

Is it worth? difficult to answer, maybe it is an option to have just in case.

A final comment on your interface to ExcelPython: it is perfect for spreadsheets formulas, as it works on the sequence
range -> variant -> py object -> python processing -> py object -> variant
but this is overkill for vba programmnig, where i’d like to have also internal routines like
py object -> python processing -> py object
in vba, just use
dim A
set A = ….. (py call)
and then the object A is alive and it can be used as such in other vba/ExcelPython processing. I think this “double” interface might increase the appeal of the whole thing.
Best regards, maurizio

Like

This site uses Akismet to reduce spam. Learn how your comment data is processed.