Listing Python modules and getting help docs from Excel

Most Python functions include detailed help documentation, but to access it you need the full path to the function, including the names of all parent modules and submodules. This post looks at how this information can be found from Excel with pyxll, using the Scipy and Numpy libraries as examples.

As well as pyxll the code requires the inspect, importlib, pkgutil, and numpy libraries:

from inspect import getmembers, isfunction, getdoc, ismodule, signature
import importlib as imp
import pkgutil

import numpy as np

from pyxll import xl_func, xl_arg, xl_return

Published examples of listing the contents of a module commonly use either inspect.getmembers or pkgutil.iter_modules. The code below allows either of these functions to be used (selected with the out argument), and returns either the full output or just the names of the listed modules (selected with the out2 argument):

@xl_func
def get_modlist(modname, out=1, out2 = 1):
    mod = imp.import_module(modname)
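    if out == 1:
        # getmembers returns (name, object) pairs for the attributes of mod that are modules
        memb = getmembers(mod, ismodule)
        namelist = [name for name, _ in memb]
    else:
        # iter_modules yields (module_finder, name, ispkg) for each entry on the package path
        memb = list(pkgutil.iter_modules(mod.__path__))
        namelist = [submod[1] for submod in memb if submod[2]]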
    if out2 == 1:
        return memb
    else:
        return namelist

This function was found to give different results with Numpy and Scipy. Using getmembers on the top level Scipy module returned no results, but Numpy returned all available module names:

Using pkgutil.iter_modules returns the available modules in Scipy, which are indicated with TRUE in the third column, but for Numpy all the modules are indicated as FALSE:
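The difference between the two functions can be seen without Excel or the scientific libraries. The sketch below is an illustration only, using the standard library packages xml and json as stand-ins for Scipy and Numpy. getmembers only reports module objects that are already attributes of the imported package, whereas iter_modules scans the package path and flags each entry as a package or a plain module in its third field (ispkg), which may be part of the reason the two libraries behave differently.

```python
import inspect
import json
import pkgutil
import xml

# Attributes of the imported package that are module objects.
# Submodules that have not been imported yet do not appear here.
imported = [name for name, _ in inspect.getmembers(xml, inspect.ismodule)]

# Everything found on the package path, whether imported or not.
# Each entry is a ModuleInfo(module_finder, name, ispkg) named tuple.
found = list(pkgutil.iter_modules(xml.__path__))
subpackages = [info.name for info in found if info.ispkg]

# The json package contains plain module files only, so ispkg is False for them.
plain_modules = [info.name for info in pkgutil.iter_modules(json.__path__)
                 if not info.ispkg]

print("already imported:", imported)
print("xml sub-packages:", subpackages)
print("json modules:", plain_modules)
```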

The code below uses pkgutil.iter_modules for the top level Scipy module, and getmembers for all other cases:

@xl_func
def get_modules(modname):
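    try:
        mod = imp.import_module(modname)
    except Exception:
        # Return an empty list if the name cannot be imported
        return []
    if modname == 'scipy':
        # Top level Scipy: list the packages found on the package path
        memb = list(pkgutil.iter_modules(mod.__path__))
        namelist = [submod[1] for submod in memb if submod[2]]
    else:
        # All other cases: list the module attributes of the imported module
        memb = getmembers(mod, ismodule)
        namelist = [submod[0] for submod in memb]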
    return namelist

The screenshot below shows this function displaying scipy modules and 5 levels of submodule:
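A different route to a multi-level listing, not the one used in the spreadsheet, is pkgutil.walk_packages, which recurses through sub-packages itself. The sketch below is a minimal illustration using the standard library xml package; the helper name list_submodules is hypothetical. Note that walk_packages has to import each package it finds in order to look inside it, so it can be slow on a large library.

```python
import pkgutil
import xml

def list_submodules(package):
    """Return (depth, full_name, ispkg) for every module below package."""
    prefix = package.__name__ + "."
    rows = []
    for info in pkgutil.walk_packages(package.__path__, prefix):
        # Depth is 1 for direct children, 2 for their children, and so on
        depth = info.name.count(".")
        rows.append((depth, info.name, info.ispkg))
    return rows

rows = list_submodules(xml)
for depth, name, ispkg in rows:
    print("  " * (depth - 1) + name + (" (package)" if ispkg else ""))
```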

For any selected submodule all the available functions can be listed with the get_funcs function:

@xl_func
def get_funcs(modname,  searchstring =''):
    mod = imp.import_module(modname)
    memb = getmembers(mod)
    # Keep Python functions and Numpy ufuncs whose names start with searchstring
    namelist = [name for name, obj in memb
                if (isfunction(obj) or isinstance(obj, np.ufunc)) and name.startswith(searchstring)]
    return namelist

This function will display all functions included in the sub-module, or optionally a search string may be used:
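The same filtering can be tried outside Excel. The sketch below is an illustration only, using the standard library statistics module and a hypothetical helper named list_functions. It uses inspect.isfunction alone, so functions implemented in C (which isfunction does not recognise) are not listed; that limitation is presumably why get_funcs also checks for Numpy ufuncs.

```python
import importlib
from inspect import getmembers, isfunction

def list_functions(modname, prefix=""):
    """Return the names of Python functions in modname that start with prefix."""
    mod = importlib.import_module(modname)
    return [name for name, obj in getmembers(mod, isfunction)
            if name.startswith(prefix)]

# All functions in the statistics module whose names start with "median"
medians = list_functions("statistics", "median")
print(medians)
```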

From the list of functions, one can be chosen to display the built-in help with the get_docs function:

@xl_func
@xl_arg('afunc', 'str')
@xl_arg('modname', 'str')
@xl_return('numpy_column<str>')
def get_docs(afunc, modname):
    try:
        mod = imp.import_module(modname)
        afun = getattr(mod, afunc)
        doc = getdoc(afun).split('\n')
    except Exception:
        # Module or function not found, or the function has no docstring
        doc = ''
    return np.array([doc])

With recent versions of Excel this function will return a dynamic array, automatically resizing to display the full extent of the text:
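The lookup itself can be checked from a Python prompt before it is called from Excel. The sketch below is a plain Python variant, not the spreadsheet code: the helper name doc_lines is hypothetical, and it also adds the call signature as a first line, using the signature function that is imported at the top of the post.

```python
import importlib
from inspect import getdoc, signature

def doc_lines(modname, funcname):
    """Return the signature and docstring of modname.funcname as a list of lines."""
    try:
        obj = getattr(importlib.import_module(modname), funcname)
    except (ImportError, AttributeError):
        return ["Not found: " + modname + "." + funcname]
    try:
        header = funcname + str(signature(obj))
    except (TypeError, ValueError):
        header = funcname  # some built-in callables expose no signature
    # getdoc returns None when there is no docstring
    doc = getdoc(obj) or "No documentation available"
    return [header] + doc.split("\n")

lines = doc_lines("textwrap", "dedent")
for line in lines:
    print(line)
```

Returning one list item per line matches the column layout used by get_docs, so each line of the docstring lands in its own cell.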

The code and spreadsheet may be downloaded from:

PythonDocs.zip

Posted in Excel, Link to Python, Newton, NumPy and SciPy, PyXLL, UDFs

Pint, MPmath and implied units, working with Excel

Spreadsheets linking to the Python Pint and MPmath libraries have been presented here before at:

Units and solvers with Pint and Sympy

mpmath for Excel

I have now updated the spreadsheet to work with pyxll, and with some new functions and examples. The spreadsheet and associated Python code can be downloaded from:

EvalU.zip

New functions include:

py_Quant creates a Pint Quantity object, which may be conveniently used to convert between any compatible units:

py_AddUnits adds units to the Pint Unit Registry, and py_UnitDefined checks whether a unit is already defined:

A new example has been added, illustrating how to work with formulae that have constants with implied units, using as an example finding the tensile strength of concrete based on a constant times the concrete’s compressive strength:
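The arithmetic behind implied units can be illustrated without Pint. The sketch below is not the spreadsheet's formula: it assumes, purely for illustration, a relationship of the form ft = k * sqrt(fc) with k = 7.5 when both stresses are in psi. The constant then carries implied units of psi^0.5, so it has to be rescaled before the same formula can be used with stresses in MPa. The function names are hypothetical.

```python
import math

PSI_PER_MPA = 145.0377  # 1 MPa expressed in psi

def tensile_psi(fc_psi, k_psi=7.5):
    """Assumed formula: tensile strength in psi from compressive strength in psi."""
    return k_psi * math.sqrt(fc_psi)

def k_in_mpa_units(k_psi):
    """Rescale the constant so that the same formula works with MPa throughout."""
    # ft[MPa] = k_psi * sqrt(fc[MPa] * PSI_PER_MPA) / PSI_PER_MPA
    #         = (k_psi / sqrt(PSI_PER_MPA)) * sqrt(fc[MPa])
    return k_psi / math.sqrt(PSI_PER_MPA)

fc_mpa = 32.0

# Route 1: convert to psi, apply the formula, convert the result back to MPa
ft_via_psi = tensile_psi(fc_mpa * PSI_PER_MPA) / PSI_PER_MPA

# Route 2: apply the formula directly in MPa with the rescaled constant
ft_direct = k_in_mpa_units(7.5) * math.sqrt(fc_mpa)

print(k_in_mpa_units(7.5), ft_via_psi, ft_direct)
```

A units library performs this bookkeeping automatically once the constant is given explicit units, which is the point of the spreadsheet example.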

Posted in Concrete, Excel, Link to Python, Newton, PyXLL, UDFs

3D Matplotlib Plots in Excel

As well as Excel, the code shown in this post requires current versions of Python, Numpy, Matplotlib, and pyxll. The required import statements at the head of the Python code are:

import numpy as np

import matplotlib as mpl
import matplotlib.pyplot as plt
import mpl_toolkits.mplot3d.axes3d as axes3d
from matplotlib import colors

import pyxll
from pyxll import xl_func, xl_arg, xl_return, plot

The spreadsheets and Python code described below may be downloaded from Matplotlib3D.zip.

The last post on creating Matplotlib animations in Excel had examples of 3D plots which either plotted a single line, or a single surface defined by points on a regular grid. The code below is a simple example of the latter, using test data included in the Matplotlib library:

@xl_func
@xl_arg('rtn', 'int')
@xl_return("numpy_array")
def PlotWireFrame(val = 0.05, rtn = 2):
    fig = plt.figure()
    ax = fig.add_subplot(111, projection='3d')
    ax.set_title('Wireframe Plot')

    #getting test data
    X, Y, Z = axes3d.get_test_data(val)

    #drawing wireframe plot
    cb = ax.plot_wireframe(X, Y, Z, cstride=10,
                            rstride=10, color="green")

    plot(cb)
    dat = X, Y, Z
    return dat[rtn] 

This generates the graph below when called from Excel:

The Excel PlotWireFrame UDF generates the wireframe graph and also returns the data for the selected axis.

The plot_wireframe function plots a 3D surface, but for my purposes I more frequently need to plot a 3D frame with the following features:

  • The plot should allow for a large number of straight line segments with any orientation and connections.
  • It should be possible to plot different groups of elements with different colours.
  • All three axes should be plotted to the same scale.
  • It should be possible to specify the viewpoint angles and the centre and extent of the plotted image.

The code below performs this task. The input data is specified in two ranges:

  • The lines are specified as separate straight line segments with a “material” number, start node number and end node number.
  • The nodes are specified in a 3 column range with X Y and Z coordinates.

The code combines all the lines of the same material into 3 arrays with X, Y and Z coordinates. Each line is specified with the start and end coordinates, then None, so that lines that are not connected are plotted with a gap between them:

@xl_func
@xl_arg('Nodes', 'numpy_array')
@xl_arg('Beams', 'numpy_array<int>')
@xl_arg('CenXYZ', 'numpy_array', ndim=1)
@xl_arg('ViewAng', 'numpy_array', ndim=1)
@xl_arg('DisplayAx', 'bool')
@xl_arg('Xrange', 'float')
@xl_arg('LineWidth', 'float')
def Plot3D(Nodes, Beams, ViewAng = None, CenXYZ = None, Xrange = 0., DisplayAx = False, LineWidth = 1):
    fig = mpl.figure.Figure()

    # Create the axes with the viewing angle if specified, or with defaults
    try:
        ax = axes3d.Axes3D(fig, azim = ViewAng[0], elev = ViewAng[1]) # roll = ViewAng[2] to be added in next release
    except Exception:
        ax = axes3d.Axes3D(fig)

    # Apply equal scaling and perspective projection to the axes that are actually plotted
    ax.set_box_aspect([1, 1, 1])
    ax.set_proj_type('persp')
    
    # Set axis limits
    if Xrange == 0:
        Xrange = (np.max(Nodes[:,0]) - np.min(Nodes[:,0]))/2
    rng = Xrange/2
    
    try:
        X, Y, Z = CenXYZ[0:3]
    except:
        X = (np.max(Nodes[:,0]) + np.min(Nodes[:,0]))/2
        Y = (np.max(Nodes[:,1]) + np.min(Nodes[:,1]))/2 
        Z = (np.max(Nodes[:,2]) + np.min(Nodes[:,2]))/2    
    
    ax.set_xlim3d(X-rng,X+rng)
    ax.set_ylim3d(Y-rng,Y+rng)
    ax.set_zlim3d(Z-rng, Z+rng)
    
    # Read beams and coordinates
    rows = Beams.shape[0]
    mats = np.max(Beams[:,0])
    nummata = np.zeros(mats, dtype = int)
    for i in range(0, mats):
        nummata[i] = np.count_nonzero(Beams[:,0] == i+1)

    colors =['tab:blue', 'tab:orange', 'tab:green', 'tab:red', 'tab:purple', 'tab:brown', 'tab:pink', 'tab:gray', 'tab:olive', 'tab:cyan']
   
    if not DisplayAx:
        ax.set_axis_off()

    # Create lines for each material
    for mat in range(0, mats):
        rowsm = nummata[mat]    
        rows2 = rowsm*3
        x_line = np.empty(rows2)
        y_line = np.empty(rows2)
        z_line = np.empty(rows2)
        matcol = np.mod(mat, 10)
        col = colors[matcol]
        j = 0
        for i in range(0, rows):
            if Beams[i, 0] == mat+1:
                n1 = Beams[i,1]-1
                n2 = Beams[i, 2]-1
                x_line[j] = Nodes[n1,0]
                x_line[j+1] = Nodes[n2,0]
                x_line[j+2] = None
                y_line[j] = Nodes[n1,1]
                y_line[j+1] = Nodes[n2,1]
                y_line[j+2] = None
                z_line[j] = Nodes[n1,2]
                z_line[j+1] = Nodes[n2,2]
                z_line[j+2] = None
                j = j+3
        
        ax.plot3D(x_line, y_line, z_line, col, linewidth=LineWidth)
    plot(ax)
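The gap-separated layout used in Plot3D can be seen in isolation in the sketch below. It is a plain Python illustration rather than the code used in the spreadsheet: the helper name segments_to_polyline is hypothetical, and NaN is written explicitly where the Numpy version assigns None (which Numpy stores as NaN in a float array). Matplotlib breaks a line wherever it meets a NaN, so each material can be drawn with a single plot call.

```python
def segments_to_polyline(nodes, beams, material):
    """Return X, Y, Z lists for all beams of one material, separated by NaN."""
    nan = float("nan")
    xs, ys, zs = [], [], []
    for mat, n1, n2 in beams:
        if mat != material:
            continue
        # Node numbers are 1-based, as in the spreadsheet ranges
        start, end = nodes[n1 - 1], nodes[n2 - 1]
        for axis, out in enumerate((xs, ys, zs)):
            out.extend([start[axis], end[axis], nan])
    return xs, ys, zs

# Four nodes and three beams; beams 1 and 3 are material 1 and are not connected
nodes = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [5.0, 5.0, 5.0]]
beams = [[1, 1, 2], [2, 2, 3], [1, 3, 4]]

xs, ys, zs = segments_to_polyline(nodes, beams, 1)
print(xs)
```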

The code was checked by plotting three circles centred at the origin and with radius 10, in the XY, XZ, and YZ planes:
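Test data of this kind can be generated with a few lines of Python. The sketch below is one possible way of doing it, not the method used for the spreadsheet: the function name circle_frame and the choice of 36 segments per circle are assumptions. It returns the node coordinates and the beam list in the same layout as the two input ranges, with one material number per circle.

```python
import math

def circle_frame(radius=10.0, segments=36):
    """Return (nodes, beams) for three circles in the XY, XZ and YZ planes."""
    nodes = []  # rows of [x, y, z]
    beams = []  # rows of [material, start_node, end_node], node numbers 1-based
    for material in (1, 2, 3):
        first = len(nodes) + 1  # number of the first node of this circle
        for k in range(segments):
            angle = 2.0 * math.pi * k / segments
            u = radius * math.cos(angle)
            v = radius * math.sin(angle)
            if material == 1:
                nodes.append([u, v, 0.0])  # XY plane
            elif material == 2:
                nodes.append([u, 0.0, v])  # XZ plane
            else:
                nodes.append([0.0, u, v])  # YZ plane
            # Connect each node to the next, wrapping back to the first node
            beams.append([material, first + k, first + (k + 1) % segments])
    return nodes, beams

nodes, beams = circle_frame()
print(len(nodes), "nodes,", len(beams), "beams")
```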

The default plot looks along the X axis with the Y axis to the right, and the Z axis vertical. In the next plot the viewpoint is rotated 45 degrees about the Z axis (azimuth), with an elevation of 30 degrees:

The next example shows a much more complex plot: a 3D image of the Sydney Harbour Bridge with the axes display turned off and the line width reduced to 1:

The data for this image was taken from a Strand7 file, available from the Strand7 web site to licensed Strand7 users. The top of the data range is shown below. In all there are 11,473 beams and 7,737 nodes.

The next example shows the same data with a different view angle and centre coordinates:

Finally, the same data is shown in the Strand7 FEA program, which gives a very similar image to the Matplotlib/Excel plot with the same view angles:

The current Matplotlib code does not allow for rotation about the line of sight (“roll”), but this feature is under development and is expected to be included in the next release.

Posted in Charts, Coordinate Geometry, Drawing, Excel, Link to Python, Newton, NumPy and SciPy, PyXLL, UDFs

Free and Simple Tools for Editing Videos

I recently needed to extract 15 minutes from a 45-minute video, as 5 separate clips. I discovered that there are free tools to do this simply and efficiently, although they are not widely advertised. Here’s how:

By default, videos on my computer open with the Windows 10 “Film & TV” app, which has an icon in the bottom right corner to “Edit with Photos”. Clicking that lists several options, including Trim:

Alternatively the video can be opened directly in Photos, which has an Edit-Trim icon at top right:

Either option opens the video in the Photos Editor, where the section of the video to be trimmed can be selected by moving the start and end sliders at the bottom of the window. Then click Save-as at the top right:

This process is repeated for each clip by re-opening the original video, in my case producing 5 video clips. These can be simply combined into a single video with the Adobe online Merge Videos page:

https://www.adobe.com/express/feature/video/merge

Posted in Bach, Computing - general, Newton

Python functools – cache and lru_cache

Update 3rd April 2022: Following the comment from Larry Schuster I have modified the code so that the factorial0 function actually uses cache as intended, rather than lru_cache. I also added code to clear the cache after each timer run, because the uncleared cache was giving misleading results, and added a timer function to call each of the four factorial routines from within Python, rather than calling them as separate UDFs from Excel. The example timer results have all been updated for the new code.

Functools is a built-in Python module “for higher-order functions: functions that act on or return other functions”. Of the collection of functions, the two that seemed the most obviously useful for my purposes were cache and lru_cache:

@functools.cache(user_function)
Simple lightweight unbounded function cache. Sometimes called “memoize”.
Returns the same as lru_cache(maxsize=None), creating a thin wrapper around a dictionary lookup for the function arguments. Because it never needs to evict old values, this is smaller and faster than lru_cache() with a size limit.

@functools.lru_cache(user_function)
@functools.lru_cache(maxsize=128, typed=False)
Decorator to wrap a function with a memoizing callable that saves up to the maxsize most recent calls. It can save time when an expensive or I/O bound function is periodically called with the same arguments.

As a simple example, I have used a recursive function to find the factorial of the passed argument, with alternative versions with the cache and lru_cache decorators, and also the Numba just in time compiler decorator:

from functools import cache, lru_cache
from numba import njit

def factorial(n):
    return 1 if n <= 1 else n * factorial(n - 1)

@cache
def factorial0(n):
    return 1 if n <= 1 else n * factorial0(n - 1)

@lru_cache()
def factorial1(n):
    return 1 if n <= 1 else n * factorial1(n - 1)

@njit(cache = True)
def factorial2(n):
    return 1 if n <= 1 else n * factorial2(n - 1)

These functions have been called from Excel using pyxll, via timer functions allowing a specified number of repetitions, and returning the factorial result and the execution time for each function. One timer function called a specified factorial function, and was entered separately in Excel for each of the factorial functions. The other called all four factorial functions from within Python.
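A timer of this kind might look like the sketch below. It is a minimal standard library illustration, not the code in the spreadsheet: the name time_factorial is hypothetical, the pyxll decorators are omitted, and only the plain and cache versions of the factorial function are included. The cache is cleared after each timer run, as described in the update note at the top of the post.

```python
import time
from functools import cache

def factorial(n):
    return 1 if n <= 1 else n * factorial(n - 1)

@cache
def factorial0(n):
    return 1 if n <= 1 else n * factorial0(n - 1)

def time_factorial(func, n, reps=1):
    """Call func(n) reps times; return (result, elapsed time in seconds)."""
    result = None
    start = time.perf_counter()
    for _ in range(reps):
        result = func(n)
    elapsed = time.perf_counter() - start
    # Clear any cache so that the next timer run starts from the same state
    if hasattr(func, "cache_clear"):
        func.cache_clear()
    return result, elapsed

plain_result, plain_time = time_factorial(factorial, 100, 10000)
cached_result, cached_time = time_factorial(factorial0, 100, 10000)
print(plain_time, cached_time)
```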

For 1 repetition with a low input number the cache and Numba functions were all slower than plain Python when called as UDFs, and slightly faster when called from Python:

Increasing the input value to 100 made the cache functions relatively slower, since the cache was updated at each step of the factorial calculation, but never actually used. The Numba function was faster than plain Python, especially when called from within Python, with the speed improvement being due to compilation of the code, rather than use of the cache.
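The cache behaviour described here can be confirmed with the cache_info method that lru_cache and cache add to the wrapped function. The sketch below is a small standard library check, separate from the timing code: a single call with an input of 100 stores 100 values without a single cache hit, and the cache only pays off when the call is repeated.

```python
from functools import lru_cache

@lru_cache()
def factorial1(n):
    return 1 if n <= 1 else n * factorial1(n - 1)

factorial1(100)
first = factorial1.cache_info()   # every recursive step was a cache miss

factorial1(100)
second = factorial1.cache_info()  # the repeated call is answered from the cache

print(first)
print(second)
```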

With 10,000 repetitions and an input value of 100 the cache functions were over 100 times faster than plain Python, with little difference in the time when called from Excel or Python. The Numba function was only 26 to 27 times faster than plain Python:

In summary the cache functions were substantially faster than plain Python for cases with a high number of repetitions of the factorial function, and for this example were about 4 times faster than the Numba function for 10,000 iterations. When the function was only called once the cache functions were significantly slower.

Examples with more practical applications will be examined in later posts.

Posted in Excel, Link to Python, PyXLL, UDFs