I have been experimenting for a few weeks with a new package linking Excel to Python, by the name of ExcelPython. Here is what the author says about it:
“ExcelPython is a lightweight, easily distributable library for interfacing Excel and Python. It enables easy access to Python scripts from Excel VBA, allowing you to substitute VBA with Python for complex automation tasks which would be facilitated by Python’s extensive standard library.”
The main points of differentiation from PyXll are:
- It is open source and free for all applications (GNU Lesser General Public License)
- All routines must be called via VBA (examples will be given in a later code, and may also be viewed in the download file)
- The number of data-types supported is much more restricted; excel variant arrays being transferred as Python lists of lists.
- Documentation is also lightweight
- Being a non-commercial package, support will be limited, although to be fair, the author does seem to respond promptly to all queries on the discussion forum.
- Data is transferred to Python via COM, which allows data to be passed by reference.
- There are no facilities for setting up Excel Menus from Python code (so far as I know), and the Python docstrings are not transferred to the Excel function wizard.
I found installation to be straightforward with my Anaconda Python setup. The only problem I had was that if PyXll is installed, the add-in must be disabled before opening any file with ExcelPython code, otherwise an immediate crash results. To disable PyXll, go to File-Options-Addins, click the Manage button, and unselect PyXll from the list of add-ins. It may easily be re-enabled from the same menu, when required.
As an example of the use of Excel-Python, I have adapted the Matrixpyxll spreadsheet. The new file may be downloaded from MatrixXLpy.zip, including full open source code.
The functions included in the new file are:
The matrix functions are used in a similar way to the built-in Excel matrix functions, and provide identical results, but performance is very much better for large matrices.
Eigenvalue and Eigenvector functions are provided, which have no equivalent in Excel.
A variety of functions are provided to solve systems of simultaneous equations. The examples shown here on small data-sets provide identical results to use of the built-in MInverse and MMult functions, but provide much better performance, especially with large data sets.
Functions linking to the PySparse library provide much better performance with large sparse systems, and also allow the solution of systems far larger than can be accommodated by systems operating on the full matrix.
Run times for 8 different functions are shown below with 4 different systems of equations:
- A small 10 x 10 system, solved 1000 times.
- A 1000 x 1000 system, solved once
- A sparse 1000 x 1000 system
- A sparse 10945 x 10945 system (sparse solvers only, this system would be much too large for the solvers operating on the full matrix)
For the small and medium sized dense systems the standard solvers were found to be significantly faster than the sparse solvers, although it should be said that the code I used for setting up the sparse matrices was probably far from optimum. With the medium sized sparse matrix the sparse solvers were significantly faster than the others, which had almost identical run times to the dense matrices of the same size. The iterative solver was marginally faster than the others with the medium sized sparse matrix, but with the large sparse matrix it was 5-10 times faster than the other sparse solvers.
The data transfer time reported on the bottom row is the difference between the total run time in VBA and the run time of the Python code. Note that this is a significant part of the total, especially for the larger dense matrix which required the transfer of a matrix of 1 million doubles. To minimise this overhead the data transferred from Excel to Python should be minimised, with the matrices being created in Python, using the functions provided in the PySparse library where possible.
The next post in this series will look at the VBA code required to call the Python code, and also differences required in the Python coding, compared with that required when Pyxll is used.
Thanks for the interesting write-up. I can see the pros/cons of Excel-Python vs PyXLL. The former probably appeals more to those who’ve been doing VBA whereas PyXLL feels more *natural* if you come from the traditional C API/XLL background. There’s also a new one I’ve come across recently, called xlwings – looks similar to Excel-Python.
Definitely looking forward to the next post on this, would be interested to know what the performance difference between PyXLL and Excel-Python (having the COM overhead in mind). Thanks!
LikeLike
Thanks bjoern. Xlwings certainly looks interesting!
On the performance question, I’m just playing with the code that was most recently posted in the forum on the ExcelPython site. I’m getting a huge reduction in the data transfer overhead time, but I’m not sure why at the moment! I’ll certainly post something here when I have some numbers that are reasonably representative.
LikeLike
xlwings does indeed look very interesting… I remember coming across it a while ago, before even starting on ExcelPython I think, but I can’t remember why I decided to go ahead with my own interface anyway.
LikeLike
Interesting approach…very cool. Thanks for sharing.
LikeLike
Pingback: Python Matrix Functions with ExcelPython – 2 | Newton Excel Bach, not (just) an Excel Blog