Transferring large arrays with Pyxll

Pyxll is a commercial Excel add-in linking Excel and Python. The latest version offers greatly improved performance in transferring data between Excel and Python as Numpy arrays, amongst many other new features and improvements.

To see what this means in practice, I have timed the inversion of a large matrix (2000 x 2000 cells) in Python using various options:

  1. Calling a Python macro from VBA. This passes data via Microsoft COM, which is inherently slower than the other methods.
  2. Calling the Python function as a user defined function (UDF), with five different data types:
    1. A Python tuple of variant data type.
    2. A tuple of float type.
    3. A Numpy array defined as floats.
    4. A Numpy array of undefined data type.
    5. Input as a Numpy array, with output as an object cache.

Parts of the input and output arrays, and times for these six options are shown in the screen-shot below:

Performance for the first option is very slow, with the data transfer taking about 20 times longer than the much more complex and compute-intensive task of inverting the matrix.

The first two UDF options, passing the data as Python tuples, show a great improvement, with the total execution time reducing to less than 2 seconds. The UDF passing the data as floats rather than variants was slightly faster, as would be expected.

The two UDFs passing data as Numpy arrays in both directions showed a further significant improvement, with the data transfer time being reduced by about half, for a total execution time of about 1 second.  There was not a significant difference between the two UDFs, which is as expected since the default data type for Numpy arrays is float.
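The Python-side part of this comparison can be sketched outside Excel. The function below is a rough illustration only: it times the conversion between tuples of tuples and a Numpy array, and the inversion itself. It does not measure the Excel transfer, so it will not reproduce the times quoted above. The name time_inverse and the default matrix size are assumptions made for the example.

```python
import time

import numpy as np


def time_inverse(n=500, seed=0):
    """Time tuple <-> array conversion and inversion of an n x n matrix."""
    rng = np.random.default_rng(seed)
    # Adding n to the diagonal makes the matrix strictly diagonally dominant,
    # so it is always invertible.
    a = rng.random((n, n)) + n * np.eye(n)
    as_tuples = tuple(map(tuple, a.tolist()))  # mimics tuple-of-tuples input

    t0 = time.perf_counter()
    from_tuples = np.array(as_tuples, dtype=float)
    t1 = time.perf_counter()
    inv = np.linalg.inv(from_tuples)
    t2 = time.perf_counter()
    back = tuple(map(tuple, inv.tolist()))  # mimics tuple-of-tuples output
    t3 = time.perf_counter()

    timings = {
        "tuples_to_array": t1 - t0,
        "invert": t2 - t1,
        "array_to_tuples": t3 - t2,
    }
    return timings, inv, a
```

Calling time_inverse(2000) gives a feel for how the conversion cost compares with the inversion cost on a given machine.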

The final UDF reduced the data return time to close to zero, with a total execution time of just over half a second.  The data returned to the spreadsheet is shown in the screen-shot below:

The return value of the UDF displays as ndarray@0, which is a pointer to the full 2000 x 2000 array. Results from the cached array may be displayed with a second UDF, which in this case has been set to display the first 2 rows. The data return time is proportional to the number of rows displayed, with the full array taking about the same time as the Numpy array results.

In addition to the greatly improved performance when not all results are required in the spreadsheet, the cache object provides a simple and effective means to deal with large data sets exceeding 1 million rows.
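As a minimal sketch of how the Numpy and object cache options might be declared, the functions below use the Pyxll xl_func decorator with signature strings. The function names are hypothetical, and the signature strings ("numpy_array<float>" and "object") are my reading of the Pyxll documentation rather than the code used for the timings above, so check them against the version you have installed. A stand-in decorator is defined so that the sketch can also be run outside Excel.

```python
import numpy as np

try:
    from pyxll import xl_func  # available when Pyxll is installed
except ImportError:
    def xl_func(signature):
        """Stand-in decorator so the sketch also runs outside Excel."""
        def decorator(func):
            return func
        return decorator


@xl_func("numpy_array<float> x: numpy_array<float>")
def inv_to_sheet(x):
    """Return the full inverse to the worksheet as an array of floats."""
    return np.linalg.inv(x)


@xl_func("numpy_array<float> x: object")
def inv_to_cache(x):
    """Return the inverse as a cached object; the cell shows only a handle."""
    return np.linalg.inv(x)


@xl_func("object arr, int rows: numpy_array<float>")
def cached_rows(arr, rows):
    """Display the first `rows` rows of a cached array."""
    return arr[:rows]
```

With this arrangement only the rows requested through cached_rows are transferred back to the worksheet.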

I also started a check of the built-in Excel MInverse function with the same array used for the Python functions.  After six minutes it was still calculating (using just one processor), and I gave up.

For practical engineering analysis, matrices for linear algebra problems may be much larger than 2000 x 2000, but will normally be sparse, with most of the elements being zero. An example (taken from a 3D frame analysis of a large building structure) is shown below, with a matrix size of 14743 x 14743:

In spite of the much larger overall matrix size, the Scipy iterative sparse solver reduces the solution time to about 0.15 seconds, and the sparse input and single column output greatly reduce the data transfer time, with the UDF with Numpy arrays again giving by far the best results.
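For readers who want to try an iterative sparse solve, the sketch below uses the Scipy conjugate gradient solver on a simple tridiagonal matrix. Both choices are assumptions made for illustration: the post does not say which iterative solver was used, and the tridiagonal matrix is only a stand-in for a real frame stiffness matrix.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import cg


def solve_sparse_demo(n=14743):
    """Solve a symmetric, diagonally dominant tridiagonal system with CG."""
    # Only the three non-zero diagonals are stored, not n x n values.
    a = diags([-1.0, 4.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csr")
    b = np.ones(n)
    x, info = cg(a, b)  # info == 0 means the solver reported convergence
    return a, b, x, info
```

The single column result x is all that would need to be returned to the spreadsheet.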

Posted in Arrays, Excel, Frame Analysis, Link to Python, PyXLL, UDFs, VBA

Scarborough Fair

Scarborough Fair is a very old song, according to Wikipedia dating back at least to 1670, and probably much earlier.  Here are three versions from the mid-sixties.

The first from Martin Carthy was released in 1965, on his first album:

The second, from Marianne Faithfull, was released in 1966 on the album North Country Maid:

The third, from Simon and Garfunkel, was released later the same year, and of the three is by far the best known:

Posted in Bach

More updates to ArcSpline and IP2

I have uploaded version 2.18 of the IP2 spreadsheet to:

IP2.zip

The new version includes updates to the IP, IP_4, and ArcSpline functions, as described below.  The download is free and includes full open-source code.

The IP and IP_4 functions find all intersections of a pair of linear splines.  The previous version allowed for only up to one intersection point per segment.  This has now been modified to allow for the maximum number of possible intersections, as shown in the screen-shot below:

Other changes are:

  • The intersection point code has been modified to avoid arithmetic problems with lines that are very close to horizontal or vertical, or near parallel.
  • The input ranges will be automatically extended or shrunk to include the columns of continuous data below the top cell of the selected range.
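The spreadsheet functions are written in VBA, but the idea can be sketched in Python. The version below is not the IP code. It is a minimal illustration that uses cross products instead of slopes, so horizontal and vertical lines need no special treatment, and it treats near-parallel segments as non-intersecting. The function names and the tolerance eps are assumptions.

```python
def segment_intersection(p1, p2, p3, p4, eps=1e-12):
    """Return the intersection of segments p1-p2 and p3-p4, or None."""
    d1x, d1y = p2[0] - p1[0], p2[1] - p1[1]
    d2x, d2y = p4[0] - p3[0], p4[1] - p3[1]
    denom = d1x * d2y - d1y * d2x  # cross product of the two directions
    scale = max(abs(d1x), abs(d1y), abs(d2x), abs(d2y), 1.0)
    if abs(denom) <= eps * scale * scale:
        return None  # parallel or very nearly parallel
    wx, wy = p3[0] - p1[0], p3[1] - p1[1]
    t = (wx * d2y - wy * d2x) / denom  # position along p1-p2
    u = (wx * d1y - wy * d1x) / denom  # position along p3-p4
    if 0.0 <= t <= 1.0 and 0.0 <= u <= 1.0:
        return (p1[0] + t * d1x, p1[1] + t * d1y)
    return None


def polyline_intersections(line_a, line_b):
    """Check every segment of line_a against every segment of line_b."""
    points = []
    for a0, a1 in zip(line_a, line_a[1:]):
        for b0, b1 in zip(line_b, line_b[1:]):
            pt = segment_intersection(a0, a1, b0, b1)
            if pt is not None:
                points.append(pt)
    return points
```

Because every pair of segments is checked, one long segment can return several intersection points. An intersection falling exactly on a shared vertex would be reported twice by this simple version.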

The ArcSpline code has been modified so that it is no longer necessary to enter a sign for the radius of each arc. All arcs may be input as positive, and the function determines the placement of the arc centre from the directions of the two tangent lines:

The example above shows the final point defined without a radius, and the CloseArc optional argument set to False.
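The geometry behind placing an arc centre from an unsigned radius can be sketched as follows. This is not the ArcSpline VBA code, just one way of doing the calculation under the assumption that each arc is tangent to the two lines meeting at an intersection point: the centre lies on the bisector of the angle between the lines, at a distance of radius / sin(half angle) from the intersection point. The function name arc_centre is hypothetical.

```python
import math


def arc_centre(p0, p1, p2, radius):
    """Centre of an arc of the given radius tangent to lines p1-p0 and p1-p2.

    Returns (centre, turn), where turn is "left" or "right" for travel
    from p0 through p1 to p2.
    """
    ux, uy = p0[0] - p1[0], p0[1] - p1[1]
    vx, vy = p2[0] - p1[0], p2[1] - p1[1]
    lu, lv = math.hypot(ux, uy), math.hypot(vx, vy)
    if lu == 0.0 or lv == 0.0:
        raise ValueError("points must be distinct")
    ux, uy, vx, vy = ux / lu, uy / lu, vx / lv, vy / lv

    cross = ux * vy - uy * vx  # sine of the angle from u to v
    if abs(cross) < 1e-12:
        raise ValueError("tangent lines are parallel")

    cos_a = max(-1.0, min(1.0, ux * vx + uy * vy))
    half = 0.5 * math.acos(cos_a)  # half the angle between the two lines
    bx, by = ux + vx, uy + vy      # bisector direction (not yet unit length)
    lb = math.hypot(bx, by)
    dist = radius / math.sin(half)
    centre = (p1[0] + dist * bx / lb, p1[1] + dist * by / lb)

    # The direction of travel into p1 is -u, so a negative cross is a left turn.
    turn = "left" if cross < 0.0 else "right"
    return centre, turn
```

The sign that previously had to be entered with the radius is recovered here from the cross product of the two directions.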

The ArcSpline function (and many other functions in this spreadsheet) returns an array of variable size, depending on the number of arcs listed, and the number of segments for each arc.  The output listing may now be either automatically extended to show the complete array, or shrunk to a smaller range.

To reduce the size of the output range, select the desired range and press Ctrl-Shift-R:

To display the full results array, select the top-left cell, then press Ctrl-Shift-S.

The screenshot below shows a split screen after the output was restored to its full extent, and the last point of the input and the CloseArc argument were then deleted. The last line of the output now displays as #N/A, since the output array is one line shorter.

Press Ctrl-Shift-S again, with the top left cell selected, and now only the extent of the output data is displayed:

 

Posted in Arrays, Coordinate Geometry, Excel, Maths, Newton, UDFs, VBA

More SciPy Solvers

The xlwings Scipy spreadsheet has been updated with a new example of the xl_SolveF function, which uses the Scipy Optimize root function. The new spreadsheet can be downloaded from:

xlScipy3.zip

The new example uses a Python function ic_calc (included in the download file) to find the ultimate reaction forces and moment for a group of bolts with specified load eccentricity and angle, and yield displacement and force. The calculation procedure is detailed at:
CE591eccentric_shear_F13.pdf

The shear force on each bolt is non-linear, depending on the bolt strength and shear displacement, and the displacement in each bolt depends on the position of the centre of rotation.  The Scipy root function adjusts the centre of rotation coordinates and the maximum bolt displacement so that the reaction forces and moments are equal to the applied forces and moment:

Results can be compared with a VBA spreadsheet doing the same calculation but using the Excel Solver, from Yakpol’s Spreadsheet Solutions for Structural Engineering.

Note that to generate the same result as my spreadsheet:

  • The XY coordinates must be adjusted to give the same perpendicular eccentricity from the centroid of the bolts to the line of action of the force.
  • The force angle in that spreadsheet is the angle to the X axis, whereas my function uses the angle to the Y axis.

The xl_SolveF function passes data to the function to be solved as two arguments:

  • A 1D vector containing the variables
  • A single array containing all other data

An interface function is therefore used to extract the input parameters for the ic_calc function in the correct format, as shown below:

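import numpy as np
import xlwings as xw


def ic_check(IC, vals):
    """Interface for xl_SolveF: return the force and moment residuals."""
    # vals[1] holds the bolt x coordinates, bolt y coordinates and scalar inputs
    xloc = vals[1][0]
    yloc = vals[1][1]
    vals2 = vals[1][2]

    res1 = xlic_calc(IC, xloc, yloc, vals2)
    totX, totY, totM, Pux, Puy, Mu = res1

    # All three residuals are zero when the bolt reactions balance the load
    return [totX - Pux, totY - Puy, totM - Mu]


@xw.func
@xw.ret(transpose=True)
def xlic_calc(IC, xloc, yloc, vals):
    xloc = np.array(xloc)
    yloc = np.array(yloc)
    ecc = vals[0]                 # load eccentricity
    theta = np.radians(vals[1])   # load angle, input in degrees from the Y axis
    deltamax = vals[2]            # displacement of the most distant bolt
    Rult = vals[3]                # ultimate bolt force

    # Trial values: centre of rotation coordinates and applied load
    ICx = IC[0]
    ICy = IC[1]
    Pu = IC[2]

    # Applied force components and moment about the centre of rotation
    Pux = Pu * np.sin(theta)
    Puy = Pu * np.cos(theta)
    Mu = Pu * (ecc - ICx * np.cos(theta) - ICy * np.sin(theta))

    # Bolt positions relative to the centre of rotation
    xIC = xloc - ICx
    yIC = yloc - ICy
    di = np.sqrt(xIC * xIC + yIC * yIC)
    dmax = np.max(di)

    # Bolt displacement is proportional to distance from the centre of rotation
    deltai = di / dmax * deltamax
    # Non-linear force-displacement relationship for each bolt
    ri = Rult * (1 - np.exp(-10.0 * deltai)) ** 0.55
    fx = yIC * ri / di
    fy = xIC * ri / di
    moment = ri * di
    totX = np.sum(fx)
    totY = np.sum(fy)
    totM = np.sum(moment)

    return [totX, totY, totM, Pux, Puy, Mu]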
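The same pattern, a residual function taking a vector of variables plus a single array of other data, can be tried on a toy problem by calling the Scipy root function directly. The example below is a sketch that does not use xl_SolveF or the bolt group code. It finds a point where the line y = m*x + c crosses a circle of radius r, and the function names and numbers are made up for the illustration.

```python
import numpy as np
from scipy.optimize import root


def residuals(variables, data):
    """Residuals are both zero where the line meets the circle."""
    x, y = variables  # the unknowns adjusted by the solver
    r, m, c = data    # all other data, passed as one array
    return [x * x + y * y - r * r, y - (m * x + c)]


def solve_demo():
    data = np.array([5.0, 1.0, 1.0])  # r = 5, m = 1, c = 1
    return root(residuals, x0=[2.0, 2.0], args=(data,))
```

The returned object has a success flag and the solution in its x attribute. The starting point matters, since the line crosses the circle at two points and the solver finds only one of them.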

 

Posted in Excel, Link to Python, Newton, NumPy and SciPy, UDFs, VBA, xlwings

Solid Air – Norma Waterson

I don’t know how I missed this, but I just discovered it yesterday:

More about the album: The Very Thought of You

On her second solo album the Grande Dame of British folk is presenting us with a surprising, inspired selection of songs. Most of the songs on this album come in pairs, and each has a story attached. Stories referring to Norma’s childhood memories or commenting contemporary events, most of them connected to famous personalities in the world of music; people as different and remarkable as Nick Drake, Judy Garland and Freddie Mercury. The mood is quiet yet lit by Norma’s powerful and passionately smooth voice, revealing new dimensions to well-known songs. Featuring Richard Thompson, Danny Thompson, Martin Carthy and Eliza Carthy.

Posted in Bach

Python Problems

I have recently returned to using Pyxll to link Excel to Python (of which more later), which required the installation of a 32 bit version of Python 3.7.  First trials after installation returned the message: “AttributeError: ‘module’ object has no attribute ‘CLSIDToPackageMap’“.

Fortunately someone else had had the same problem, raised a question on Stack Overflow, then answered his own question about 12 hours later:

After deleting C:\Temp\gen_py, the code above works again. Hope it can save trouble!

The same trick also worked for me, and is continuing to work with no further problems.

My gen_py folder was located at c:\Users\douga\AppData\Local\Temp\gen_py\, even though my Python installation was on my D: drive. In case anyone might be concerned about deleting a system folder, it contains the message: “# Generated file – this directory may be deleted to reset the COM cache…”
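The deletion can also be done from Python. The sketch below assumes the cache is a folder named gen_py inside the user's temporary directory, as it was on my machine. The function name is hypothetical, and if pywin32 can still be imported, win32com.__gen_path__ should report the folder actually in use.

```python
import shutil
import tempfile
from pathlib import Path


def clear_gen_py_cache(temp_dir=None):
    """Delete the gen_py COM cache folder; return True if it was removed."""
    # Assumption: the cache sits directly under the temporary directory.
    base = Path(temp_dir) if temp_dir is not None else Path(tempfile.gettempdir())
    cache = base / "gen_py"
    if cache.is_dir():
        shutil.rmtree(cache)
        return True
    return False
```

The folder is rebuilt the next time it is needed, so nothing has to be restored by hand.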

 

 

Posted in Computing - general, Link to Python, NumPy and SciPy

Numerical Integration; Tanh-Sinh Quadrature v. 4.4

Update 27 Nov 2018: Version 4.41 now available from the link below, with minor modifications for compatibility with Excel 2007.

I recently received a new version (4.4) of the numerical integration spreadsheet by Graeme Dennes, which is now available for download from:

Tanh_Sinh Quadrature.

The new spreadsheet includes significant improvements in the performance of several functions, as well as new functions:

The Tanh-Sinh quadrature workbook has been enhanced as follows:

(1) A faster Tanh-Sinh program has been implemented, increasing the speed by around 50 percent, and the speed of the DE programs has been doubled.

(2) A fast finite interval program TINT has been added. It runs at over twice the speed of the Gauss-Kronrod program.

(3) The speed of the Gauss-Kronrod program has been improved through modifications developed by Berntsen, Espelid and Sorevik.

(4) The Plotter worksheet now shows two plots: the plot of the selected function over the finite interval (a,b), and the plot of the function after being transformed by the Tanh-Sinh function.

(5) Now includes over 1200 test integrals with results correct to 15 significant digits. This may be the largest set of diverse test integrals and results available at no cost. It includes several of the “standard” sets of test integrals in wide use.

(6) The Romberg integrator, written by the author, may be the fastest and most accurate Romberg integrator available. Advice to the contrary would be most welcome.

(7) Minor change to allow compatibility with Excel 2007.

Graeme Dennes
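For anyone unfamiliar with the method, the basic Tanh-Sinh rule for a finite interval can be sketched in a few lines of Python. This is not the code in the workbook, which is written in VBA and is far more refined. The step size h and the number of points n are fixed values chosen for the illustration, with no error control.

```python
import math


def tanh_sinh(f, a, b, h=1.0 / 16.0, n=48):
    """Approximate the integral of f over (a, b) with a fixed Tanh-Sinh rule."""
    centre = 0.5 * (a + b)
    half_width = 0.5 * (b - a)
    total = 0.0
    for k in range(-n, n + 1):
        t = k * h
        u = 0.5 * math.pi * math.sinh(t)
        x = math.tanh(u)  # abscissa in (-1, 1), clustering near the ends
        w = 0.5 * math.pi * math.cosh(t) / math.cosh(u) ** 2  # weight
        total += w * f(centre + half_width * x)
    return half_width * h * total
```

The weights decay extremely quickly towards the ends of the interval, which is why the method copes well with integrands such as sqrt(x) on (0, 1) that are badly behaved at an end point.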

 

Posted in Charts, Excel, Maths, Newton, Numerical integration, UDFs, VBA