Python Matrix Functions with ExcelPython – 2

Following the previous ExcelPython post I have modified the VBA and Python code for improved performance and to simplify the Python coding.  The main change is that where Numpy Arrays are required in the Python code, these are now created in the VBA, as described here (see post number 9 from Eric Reynolds) and here (a more detailed and wide ranging tutorial).

The new spreadsheet and Python Code may be downloaded from MatrixXLpy.zip including full open source code.

I have also updated the performance benchmarks as shown below:

ExcelPython Matrix Solver Functions

ExcelPython Matrix Solver Functions

I have now separated out the data transfer times from the actual solution time. It can be seen that when operating on a full dense matrix the xl_Solve function is very much faster than any of the sparse functions, but on the 1000×1000 sparse matrix all of the functions performed in much the same time. With the large (10945×10945) sparse matrix (which is too large for the dense functions to handle) the xl_SpSolveCOO function (using the Scipy Library) was significantly faster than the PySparse based xl_SpSolveLU, but the iterative xl_SpSolveit function (also using the PySparse Library) was 5 times faster again. The PySparse based functions operate on the non-zero elements of the lower triangle of symmetrical matrices, allowing substantially larger problems to be solved than could be handled by the xl_SpSolveCOO function.

Coding for the xl_SpSolveCOO function is shown below:

Function xl_SpSolveCOO(SMatrix As Range, SVect As Range) As Variant
Dim Rtn As Variant
    Rtn = xl_Run2A("xl_spsolvecooL", SMatrix, SVect, modname2, 1)
    If TypeName(Rtn) = "String" Then
        xl_SpSolveCOO = Rtn
    Else
        xl_SpSolveCOO = WorksheetFunction.Transpose(Rtn)
    End If
    End Function

The Python code is called by the xl_Run2A function:

Function xl_Run2A(FuncName As String, VA_1 As Range, VA_2 As Range, Optional Modname As String = modname1, Optional RtnDims As Long = 2) As Variant
Dim methods As Variant, result As Variant, VA_1py As Variant, VA_2py As Variant, NumpyArray As Variant, Result_List As Variant
    On Error GoTo rtnerr

    Set NumpyArray = PyGet(PyModule("numpy"), "array")
    Set VA_1py = PyCall(NumpyArray, , PyTuple(VA_1.Value2))
    Set VA_2py = PyCall(NumpyArray, , PyTuple(VA_2.Value2))

    Set methods = PyModule(Modname, AddPath:=Path1)
    Set result = PyCall(methods, FuncName, PyTuple(VA_1py, VA_2py))
    Set Result_List = PyCall(result, "tolist")
    xl_Run2A = PyVar(Result_List, RtnDims)
    Exit Function
rtnerr:
    xl_Run2A = Err.Description

End Function

The associated Python code is:

import numpy as np
import scipy.sparse as ssp
import scipy.sparse.linalg as sspla
from pysparse import spmatrix, superlu, precon, itsolvers
import linalgfuncs as laf
import time

def xl_spsolvecooL(m, x):
    """
    Solve a matrix equation Mx = y in sparse COO format, input from array.
	m: three column array of: row indeces, column indeces, values.
	x: column x vector
    """
    i, j, v, x = list2ijvx(m, x)
    n = x.size
    A = ssp.coo_matrix((v,(i,j)),shape=(n,n))
    return xl_spsolvecoo(A, x)

def list2ijvx(m, x):
    """
    Convert 3 column array, m, to 3 vectors, and single column array, x, to vector
    """
    v = m[:,2]
    i = m[:,0]
    j = m[:,1]
    x = np.array(x)
    x = x[:,0]
    return i, j, v, x

def xl_spsolvecoo(A, x):
    """
    Solve a matrix equation Ax = y in sparse COO format.
    A: Sparse array in COO format.
    x: column x vector
    """
    A = A.tocsr()
    A = sspla.spsolve(A, x)
    return A

 

This entry was posted in Arrays, Excel, Link to Python, Maths, NumPy and SciPy, UDFs, VBA and tagged , , , , , , . Bookmark the permalink.

14 Responses to Python Matrix Functions with ExcelPython – 2

  1. Easier to understand and useful. This is one of the nice things a like about ExcelPython. Thanks for sharing.

    Like

  2. kalx says:

    Thanks for posting the data transfer times.

    Like

    • dougaj4 says:

      I was surprised that the actual solver time was such a small proportion of the total, especially for the 1000 x 1000 matrix. On the other hand it’s a bit misleading because in a real application it would make sense to generate the matrix in Python (or whatever language is being used for the solver), rather than transferring the whole thing from VBA.

      But it does show that there is real scope for overall speed improvement by focussing on the matrix generation code.

      Like

      • kalx says:

        This is why I love engineers. They measure things and look at the results. Wishful thinking won’t keep a bridge from collapsing. Not many professions have such a close connection to the physical world.

        Like

  3. dougaj4 says:

    I can only agree 🙂

    Like

  4. Hello Doug,

    the link to the tutorial is now broken unfortunately as I have committed to the repository. You can replace it with the following URL which is kind of like a permalink:

    https://github.com/ericremoreynolds/excelpython/blob/c0c4d642dfed45c0e8d10e20d7858ac084c83b42/Tutorial/4.%20Ranges%2C%20VBA%20arrays%20and%20NumPy%20types.md

    Regards,

    Eric.

    Like

    • dougaj4 says:

      Thanks Eric, the link is now updated,

      Incidentally, I have just discovered that the major delay in getting the spreadsheet data into the fast solve routines does not occur in the transfer from VBA to Python, the really slow part is converting a long Python list of lists into a numpy array. I’ll post more details when I’ve worked out the best way round it.

      Like

  5. dougaj4 says:

    The lists I’m working with at the moment are up to 200,000 rows. For generating the sparse matrices the PySparse library has a function to generate a sparse matrix direct from a list of lists, but the Scipy sparse matrix routine needs 3 numpy 1D arrays. The quickest way I have found so far is to copy the three column long array into three separate row arrays with a For loop, then transfer those to Python as 1D lists, which will convert to 1D Numpy arrays quickly. I can generate single column arrays from the 3 column array in one operation in VBA with the Offset function, but these are transferred to Python as a long list of single item lists, which convert to Numpy arrays very slowly. Another option is to use the Worksheetfunction.Transpose function, but that doesn’t work on long arrays.

    For my purposes it would be better if single column arrays from VBA were converted to single lists, as for single row arrays.

    Like

  6. dougaj4 says:

    Eric – yes, PyObj does what I want. The data transfer time is now down to about 5-10% of the total solution time, which is much healthier. I’ll do a post on it in the next few days.

    Like

  7. Pingback: Data Transfer to Python – Update | Newton Excel Bach, not (just) an Excel Blog

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.