Python for VBA users – 5; Using built in numpy functions

Previous Python Post

In previous posts in this series I have looked at translating VBA functions to solve quadratic and cubic equations, but the Python numpy library has a polyroots function that will solve polynomials of any degree, and will also handle polynomials with complex coefficients.

Full open source code for all the functions described in this post, as well as the py_Quadratic and py_Cubic functions, may be downloaded from py_polynomial.zip. The download file also includes the spreadsheet py_Polynomial.xlsb, including the examples illustrated below, and VBA based polynomial functions. Note that once the py_polynomial.py module has been installed the functions may be called from any Excel worksheet, including .xlsx files that have VBA disabled.

To install the Python functions:

  • Install Python with the Numpy add-in.
  • Install PyXll
  • Add: polynomial.py to the “modules =” section of the PyXll pyxll.cfg file (example included in the download file)

See: Installing Python, Scipy and Pyxll for more details.  Details of the Python code for the functions are given below, but all this is included in the polynomial.py module, and once that has been installed no further coding is required.  All the functions will be available in the same way as the built-in Excel functions.

To call the polyroots (and other polynomial functions) from any Python function the following line must be added to the Python code module:

import numpy.polynomial.polynomial as poly

The polyroots function can then be called with the following one-liner (two-liner, including the Excel decorator):

@xl_func("numpy_column CoeffA: numpy_column")
def py_Polyshort(CoeffA): return poly.polyroots(CoeffA)

The functionality can be considerably improved with a little more work however:

  • Add “Inc_power” and “Row_out” options, so that coefficients may be listed in either ascending or descending powers of x, and output arrays may be in either row or column format.
  • Specify “numpy_array” rather than “numpy_column” as the input and output data types, so that the data may be arranged in row or column format, and complex numbers may be input and output as pairs of values in adjacent cells.
  • Add “doc strings” that will appear in the function dialogue box, and “category” and “help_topic” items.
@xl_func("numpy_array CoeffA, bool Inc_power, bool Row_out: numpy_array", category="py-Maths", help_topic="http://docs.scipy.org/doc/numpy/reference/routines.polynomials.polynomial.html!0")
def py_Polyroots(CoeffA, Inc_power, Row_out):
    """
    Find the real and complex roots of a polynomial equation: a x^n + b x^(n-1) ... + y x + z = 0
    CoeffA: Row or column of function coefficients
    Inc_power: Optional, default False = coefficents listed in order of descending powers of x
    Row_out: Optional, default False = roots output in two columns (real and imaginary parts of each root)
    """

The screen shot below shows the “Insert Function” Dialogue for the py_Polyroots function, showing the function description, and help for each function argument, as defined in the Python doc string:
polyroots0

To deal with the Inc_power and Row_out options, and to deal with output of complex numbers as a pair of floats, the following operations are then required:

  • Check the orientation of the input array of coefficients (CoeffA), and transpose to column format if necessary.
  • Create an output array, with two columns x (number of roots + 1)
  • The numpy polyroots function requires a 1D array with coefficients listed in ascending powers of x.  Extract the first column of CoeffA, and if Inc_power is False reverse the order of the coefficients.  Note that this operation, including reversing the order of the coefficients, can be accomplished with a single Python “list comprehension”:
    CoeffA[::-1,0]
    or without the reversal of the order:
    CoeffA[:,0]
  • Convert complex results to a pair of floats, and count the number of complex roots.
  • Write the number of real and complex roots to the end of the output array.
  • Return the results as a row or column, depending on the value of Row_out.

The final code is shown below, followed by example output for a fifth order polynomial, with different arrangements of input and output.

@xl_func("numpy_array CoeffA, bool Inc_power, bool Row_out: numpy_array", category="py-Maths", help_topic="http://docs.scipy.org/doc/numpy/reference/routines.polynomials.polynomial.html!0")
def py_Polyroots(CoeffA, Inc_power, Row_out):
    """
    Find the real and complex roots of a polynomial equation: a x^n + b x^(n-1) ... + y x + z = 0
    CoeffA: Row or column of function coefficients
    Inc_power: Optional, default False = coefficents listed in order of descending powers of x
    Row_out: Optional, default False = roots output in two columns (real and imaginary parts of each root)
    """
# Transpose CoeffA to column format if necessary
    if CoeffA.shape[0] == 1: CoeffA = transpose(CoeffA)
# Create output array; two columns x (number of roots + 1)
    nroots = CoeffA.shape[0]-1
    resa = zeros((nroots+1,2))
# polyroots requires a 1D array with coefficients listed in ascending powers of x
# Extract the first column of CoeffA, and if Inc_power is False reverse the order of the coefficients
    if Inc_power == False:
        res = poly.polyroots(CoeffA[::-1,0])
    else:
        res = poly.polyroots(CoeffA[:,0])
# Convert complex results to a pair of floats, and count the number of complex roots
    numi = 0
    for i in range(0,nroots):
        resa[i,0] = res[i].real
        resa[i,1] = res[i].imag
        if resa[i,1] != 0: numi = numi+1
# Write the number of real and complex roots to the end of resa
    i = i+1
    resa[i,0] = nroots-numi
    resa[i,1] = numi
# Return the results as a row or column, depending on the value of Row_out
    if Row_out == False :
        return resa
    return transpose(resa)

Row input and column output

polyroots1

Column input and output and polyshort function

polyroots2a

Results for a 60th order polynomial.  The results in columns C and D are from the VBA rpolyjt() function.  The results from the two functions are sorted in different orders, but are in good agreement (see the spreadsheet for full results list)

polyroots3

The py_PolyrootsC function will accept complex coefficients of x.  The function converts each pair of values to a Python complex number, then calls the py_Polyroots function

polyroots4

The py_PolyfromRoots function generates a monic polynomial from the input roots, which may be real or complex.  The example illustrated shows the use of the Inc_power and Row_out options to generate output with ascending powers of x in row format.

polyroots5

py_PolyfromRoots function with complex roots.

polyroots6

The results generated by py_Polyroots and py_PolyrootsC have been checked using the py_Polyval function. This evaluates a polynomial defined by a series of coefficients for a specified value of x. X may be a real value defined by a single cell, or a complex value defined by two adjacent cells. As for the other functions the coefficients may be listed in descending powers of x (default), or ascending order.

polyroots7

Posted in Arrays, Excel, Link to Python, Maths, Newton, NumPy and SciPy, UDFs, VBA | Tagged , , , , , , , , , , , | 5 Comments

Python for VBA users – 4; Python data types

Previous Python Post

The basic range of data types in Python is quite simple, but by the time we have added additional types used in the numpy package, and translation of Excel data types into the Python equivalent the picture gets considerably more complicated, and you can find yourself wasting significant amounts of time chasing down errors due to using incompatible data types.  This post will look at the main data types, including numpy arrays, with examples of their use in PyXll functions.

The Python code and associated Excel spreadsheet can be downloaded from pyTypes.zip.  To use the Python functions it is necessary to install the PyXll add-in, then either add “pyTypes” under modules in the pyxll.cfg file, or simply copy and paste all the code from pyTypes.py into the worksheetfuncs.py file, to be found in the PyXll\examples folder.

The PyXll manual lists the following PyXll data types, with their Python equivalent:

PyXll and Python Data Types

PyXll and Python Data Types

Note that the unicode type is only available from Excel 2007.

An Excel range of any of the above data types can be passed as a 2D array, which is treated in Python as a “list of lists”.  In addition the PyXll var type can be used for any of the above types, or a range.  A further option for transferring an Excel range  of numerical data is use of the numpy_array, numpy_row and numpy_column types.  Further options (which we will examine in a later post) are Custom Types and the xl_cell type.

The function var_pyxll_function_1 shown below (taken from the PyXll worksheetfuncs.py file) returns the name of the data type passed to the function. The var_pyxll_function_1a function simply returns the data back to the spreadsheet.

@xl_func("var x: string")
def var_pyxll_function_1(x):
    """takes an float, bool, string, None or array"""
    # we'll return the type of the object passed to us, pyxll
    # will then convert that to a string when it's returned to
    # excel.
    return type(x)

@xl_func("var x: var")
def var_pyxll_function_1a(x):
    return x

Examples of these functions are shown in the screenshot below:

Python basic data types

Python basic data types

The var_pyxll_function_1 function passes data to Python as a var, and returns a string with the Python data type name.  Note that:

  • All numbers (including integers) become Python floats.
  • Empty cells become Python NoneType
  • Strings (including complex number strings) become Python Unicode (in Excel 2007 and later).
  • Boolean values become Python bool.
  • Excel error values become different types of Python exceptions.
  • Both 1D and 2D Excel ranges become Python lists.  In fact in both cases the range becomes a “list of lists”.

The var_pyxll_function_1a function passes data to Python as a var, and simply returns the same data, without modification:

  • All data types other than a blank, including 1D and 2D ranges and error values, are passed back unchanged, however blank cells are passed back as a zero.

Further details of the handling of data are examined in four variants of a function converting an Excel complex number string to a pair of floating point values in adjacent cells:

@xl_func("var x: var")
def py_Complex1(x):
    """
    Convert a complex number string to a pair of floats.
    x:  Complex number string or float.
    """
    if isinstance(x, float) == False:
        if x[-1] == "i":
            x = x.replace("i", "j",1)
    c = complex(x)
    clist = list([[0.,0.]])
    clist[0][0] = c.real
    clist[0][1] = c.imag
    return clist

The code first checks if the function argument is a float.  If not it checks if the last character is an “i” (the default Excel imaginary number indicator), and if so changes it to a “j” (the required character for a Python complex number).  The code then:

  • Converts the Excel complex number string to a Python complex number.
  • Creates a python “list of lists” consisting of a single list containing two floats: [0., 0.]
  • Allocates the real and imaginary parts of the complex number to the list.  Note that the index values are always base zero, and that two index values are required; the first indicating list index 0 (i.e. the first and only list in the list of lists) and the second indicating the index value in the selected list.
  • The resulting list is then returned with the statement: return clist

Note the use of text within triple quotes at the top of the function.  This text will appear in the “Excel function wizard”, including the description of each function argument.  This feature will be examined in more detail in a later post.

The results of this function are shown below:

Results from py_Complex1 and py_Complex2

Results from py_Complex1 and py_Complex2

Note that the complex number string in cell A27 is returned as a pair of values, and the other cells with numerical values are correctly returned with the cell value as the real part, and an imaginary value of zero.  The blank and Boolean values return a #VALUE! error, as does the string in cell A31, because it has spaces around the “+”, which are not allowed in a complex number string.

The function py_Complex2 is similar, except that the return type has been changed from a var to a numpy_row:

@xl_func("var x: numpy_row")
def py_Complex2(x):
    """
    Convert a complex number string to a pair of floats.
    x:  Complex number string or float.
    """
    crow = zeros(2)
    if isinstance(x, float) == False:
        if x[-1] == "i":
            x = x.replace("i", "j",1)
    c = complex(x)
    crow[0] = c.real
    crow[1] = c.imag
    return crow

A numpy_row is a 1D array created with the statement: crow = zeros(x), where x is the number of elements in the array.  Since it is a 1D array each element is specified with a single index, e.g. crow[0] = c.real.

The results for py_Complex2 are the same as py_Complex1, except that invalid input returns a #NUM! error, rather than #VALUE!

In py_Complex3 a check has been added to pick up invalid input data, so a more useful error message can be returned:

@xl_func("var x: var")
def py_Complex3(x):
    """
    Convert a complex number string to a pair of floats.
    x:  Complex number string or float.
    """
    try:
        crow = zeros(2)
        if isinstance(x, float) == False:
            if x[-1] == "i":
                x = x.replace("i", "j",1)
        c = complex(x)
        crow[0] = c.real
        crow[1] = c.imag
        return array([crow])
    except:
        return "Invalid input type"

In this case the return array has been created as a 1D numpy-row, using crow = zeros(2), but the return value may also be the string “Invalid input type”.  To handle these alternative data types the return value must be specified as a var, and the numpy_row is converted into a 2D array with: return array([crow]).

Note the use of the Python “try:, except:” keywords to handle the exception raised when the statement: c = complex(x) is applied to an x that is not a valid complex number string or a float.

py_Complex3 and py_Complex4

py_Complex3 and py_Complex4

In py_Complex4 the check of the input data type has been amended to first check for a float or a Boolean (in which case the value is returned), then check for a string.  A blank cell does not satisfy any of these conditions, and returns a value of zero.  The numpy_row return type has been used, which precludes returning a string, but this could be changed to a var, as in py_Complex3, to allow an error message to be returned for invalid string input.

@xl_func("var x: numpy_row")
def py_Complex4(x):
    """
    Convert a complex number string to a pair of floats.
    x:  Complex number string or float.
    """
    crow = zeros(2)
    if isinstance(x, float) or isinstance(x, bool):
        crow[0] = x
    elif isinstance(x, unicode):
        if x[-1] == "i":
            x = x.replace("i", "j",1)
        c = complex(x)
        crow[0] = c.real
        crow[1] = c.imag
    else:
        crow[0] = 0
    return crow

The use of different types of array is illustrated with three variants of a function to sum the values in each row of a 2D range, and return the results as a column array.

@xl_func("float[] x: float[]")
def SumRows1(x):
    """
    Sum rows of selected range.
    x:  Range to sum.
    """
    nrows = len(x)
    i = 0
    suml = [0.] * nrows
    for row in x:
       for j in range(0,3):
            suml[i] = suml[i] + row[j]
       i = i+1
    return transpose([suml])

@xl_func("var x: var")
def SumRows2(x):
    """
    Sum rows of selected range.
    x:  Range to sum.
    """
    nrows = len(x)
    i = 0
    suml = [0.] * nrows
    for row in x:
       for j in range(0,3):
            suml[i] = suml[i] + row[j]
       i = i+1
    return transpose([suml])

@xl_func("numpy_array x: numpy_column")
def SumRows3(x):
    """
    Sum rows of selected range.
    x:  Range to sum.
    """
    nrows = len(x)
    i = 0
    suml = zeros(nrows)
    for row in x:
       for col in row:
            suml[i] = suml[i] + col
       i = i+1
    return suml

This code uses two variants of a for loop:

  • for row in x: loops three times to return a list of 3 values for each of the three rows in x
  • for j in range(0, 3): loops three times returning the values 0, 1, 2
  • the j value is then used as the index in row[j] to return each value in each row

The row sum values are assigned to the list suml, which is finally returned as a 2D array with 3 rows and 1 column with the line: return transpose([suml])

SumRows1 and SumRows2 are similar, except that SumRows2 uses vars for both input and output rather than float arrays (float[]).

In SumRows3 the input is a numpy_array, and output a numpy_column.  suml is created as a 1D numpy array, using the zeros function, and because the return type has been declared as a numpy_column this can be returned without transposing.

The screenshot below shows that when the input array has numbers in every cell them the three functions produce identical results:

SumRows functions

SumRows functions

However if one more cells are blank SumRows2 returns an error, whreas SumRows1 and SumRows2 allocate a zero to the blank cell, and return the correct results.

Finally the convenience of use of the numpy routines is illustrated with a function to solve systems of linear equations with a single line of Python code:

@xl_func("numpy_array a, numpy_column x: numpy_column")
def py_solve(a, x):
	"""
        Solve a matrix equation ax = y.
	a: square matrix
	x: column x vector
	"""
	return solve(a, x)
py_solve

py_solve

Posted in Arrays, Excel, Link to Python, NumPy and SciPy, UDFs, VBA | Tagged , , , , , , , | 5 Comments

Python for VBA users – 3

In the previous post in this series I looked at translating a routine to solve quadratic equations from VBA to Python.  Using the same process I have also created Python code to solve Cubic equations.  The only added difficulty for the Cubic routine was that the VBA function included a call to a VBA function to sort the results in ascending order.  It would have been quite easy to translate the VBA routine, but since Python already includes a sort routine it was easier to use this, so the VBA function call:

BubbleSort CubicA, 1, 3

became in Python:

CubicA[0:3,0:1] = sort(CubicA[0:3,0:1],0)

Note that this is the numpy sort routine, operating on the first 3 rows of a numpy 2D array.

Having generated the code that was returning numerical results in the desired format, it was just necessary to check that the results were correct, however a quick check soon showed that they were not.

For the Quadratic routine a quick check through the code soon found the source of the error.  I had converted the VBA:

If (Abs(B) >= Abs(C)) Then
            E = 1# - (A / B) * (C / B)
            D = Sqr(Abs(E)) * Abs(B)
        Else
            E = A
            If (C < 0#) Then E = -A
            E = B * (B / Abs(C)) - E
            D = Sqr(Abs(E)) * Sqr(Abs(C))
        End If

To Python

     if (abs(B) >= abs(C)):
            E = 1. - (A / B) * (C / B)
            D = ((abs(E)) * abs(B))**0.5
        else:
            E = A
            if (C < 0.):
                 E = -A
             E = B * (B / abs(C)) - E
             D = ((abs(E)) * (abs(C))**0.5)**0.5

The Python math module has a sqrt function, so it would have been simpler to use that, but using the equivalent **0.5, it was just necessary to correct the placement of brackets and exponents:

     if (abs(B) >= abs(C)):
            E = 1. - (A / B) * (C / B)
            D = ((abs(E))**0.5) * abs(B)
        else:
            E = A
            if (C < 0.):
                E = -A
            E = B * (B / abs(C)) - E
            D = ((abs(E))**0.5) * (abs(C))**0.5

The problem with the cubic code was not so obvious, but some experimentation revealed that the problem was occurring when there were one real and two complex roots, and was caused because:

A_ = -sgnR * (abs(R_) + (R2 - Q3) ** 0.5) ** (1 / 3)

was always evaluating to 1 or -1, i.e. to -sgnR.  The problem was due to the fact that in Python version 2 division of two integers always returns an integer, so 1/3 = 0.  To get a decimal result in Python 2 at least one of the two values must be non-integer, so 1./3 = 0.3333333333333333, and yields the correct result.

In this case changing the one instance of 1/3 to 1./3 was sufficient to fix the problem, but a better solution can be achieved by adding:

from __future__ import division

to the top of the code module.  This line causes Python Version 2 to use the Version 3 rules for division, which yields a decimal result when two integers are divide, rather than an integer.

Results from the two functions are shown in the screenshot below:

py_Quad and py_CubicC results

py_Quad and py_CubicC results

Posted in Excel, Link to Python, Maths, UDFs, VBA | Tagged , , , , , , | 2 Comments

Unchained Unchained Melodies and Unchained Guitars

The Telescoper has a link to a short YouTube clip of the original and excellent version of The Unchained Melody (subsequently made famous by the Righteous Brothers), which led me to a slightly less conventional version:

And at the suggested links at the end of that I was taken to guitarist Pierre Bensusan:

And while I was browsing guitarists, this from The Little Unsaid caught my attention:

Listening to the Pierre Bensusan tracks, I wondered if he had ever played with John Renbourn.  It seems he has, and that Renbourn (and Bert Jansch) were some of his earliest influences:

Interview Double Bill Concert with John Renbourn (2002)

Posted in Bach | Tagged , , , | 1 Comment

Python for VBA users – 2

This post will look in more detail at translating the VBA Quadratic function (described in Python for VBA users – 1) into a Python function, linked to Excel via PyXll.  For the purposes of this exercise I created a new Python module called quadratic2.py, and copied the VBA code into the blank module, using the PyScripter text editor.

After the first pass through of the procedure given in the previous Python post the code looked like this:

@xl_func("numpy_array CoeffA: numpy_array")
def py_Quadratic2(CoeffA):
# Calculate the zeros of the quadratic a*z**2+b1*z+c.
# The quadratic formula, modified to avoid overflow, is used to find the
# larger zero if the zeros are real and both zeros are complex.
# The smaller real zero is found directly from the product of the zeros c/a.
    if CoeffA.shape[0] == 1:
        CoeffA = transpose(CoeffA)
    A = CoeffA[0]
    B1 = CoeffA[1]
    C = CoeffA[2]
    if (A == 0.):
        NR = 1.
        SR = 0.
        if (B1 != 0):
            SR = -C / B1
        LR = 0.
        SI = 0.
        LI = 0.
    elif (C == 0.):
        NR = 1.
        SR = 0.
        LR = -B1 / A
        SI = 0.
        LI = 0.
    else:
        # Compute discriminant avoiding overflow
        B = B1 / 2.
        if (abs(B) >= abs(C)):
            E = 1. - (A / B) * (C / B)
            D = ((abs(E)) * abs(B))**0.5
        else:
            E = A
            if (C < 0.):
                 E = -A
             E = B * (B / abs(C)) - E
             D = ((abs(E)) * (abs(C))**0.5)**0.5
         if (E >= 0.):
            # Real zeros
            NR = 2
            if (B >= 0.):
                D = -D
            LR = (-B + D) / A
            SR = 0.
            if (LR <> 0.):
                SR = (C / LR) / A
            SI = 0.
            LI = 0.
        else:
            # complex conjugate zeros
            SR = -B / A
            LR = SR
            SI = abs(D / A)
            LI = -SI<br />
    if NR == 1:
        QuadA[1, 1] = SR
        QuadA[3, 1] = NR
    else:
        if LR < SR:
            QuadA[1, 1] = LR
            QuadA[2, 1] = SR
            QuadA[3, 1] = NR
            if NR == 0:
                QuadA[1, 2] = LI
                QuadA[2, 2] = SI
                QuadA[3, 2] = 2
        else:
            QuadA[1, 1] = SR
            QuadA[2, 1] = LR
            QuadA[3, 1] = NR
            if NR == 0:
                QuadA[1, 2] = SI
                QuadA[2, 2] = LI
                QuadA[3, 2] = 2
    return QuadA

Note that the function argument is designed to accept a 3 cell range, which may be either a single row or a single column.  The logic for dealing with this has been changed in the Python function, for reasons that will discussed in a later post.  For now just observe that the contents of the three cell input range (CoeffA) are copied into variables A, B1 and C.

After saving this code, and re-loading PyXll in Excel (Using Add-ins, PyXll, Reload PyXll), I entered three quadratic coefficients (a, b and c in ax^2 + bx + c = 0) in adjacent cells, and the function in a blank cell:

=py_Quadratic2(G10:I10)

This returned a Name? message. Before a new module can be used it must be added to the pyxll.cfg file in the PyXll folder.  After adding quadratic2 to the modules list in the cfg file, and re-loading PyXll in Excel I was still getting the Name? message.

The problem this time could be found in the PyXll log file, which showed:

  • 2013-09-28 15:58:00,308 – ERROR : Error importing ‘quadratic2’: name ‘xl_func’ is not defined

The PyXll library must be imported at the top of any code module where it is used.  In addition I would be using maths functions, and also numpy arrays and the transpose function from the numpy library, so the following lines were added to the top of the code module:

from pyxll import xl_func
from math import *
from numpy import *

Saving and re-loading the code, the function now displayed: #NUM! indicating progress, but not quite there yet.

Looking at the log file again, the last message was:

“2013-09-28 16:42:25,936 – ERROR : if NR == 1:
2013-09-28 16:42:25,936 – ERROR : UnboundLocalError: local variable ‘NR’ referenced before assignment.

Looking through the code, you will see that the variable NR is assigned a value 1 if either A or C are equal to zero, but if both are non-zero we arrive at the statement “if NR == 1:” before NR has been assigned any value.  This can be corrected by simply inserting NR = 0 near the top of the code.

Saving and re-loading, the function still shows #NUM!, and the log file now records:

2013-09-28 16:51:23,957 – ERROR : QuadA[1, 1] = SR
2013-09-28 16:51:23,959 – ERROR : NameError: global name ‘QuadA’ is not defined

We need to create the array QuadA before we can assign values to it.  We will look at the Python alternatives for storing arrays (or array-like objects) in a later post; for now we will just create a 3×2 array using the numpy “zeros” function:

QuadA = zeros((3,2))

Note the double brackets required for a 2D array, and that zeros uses () rather than []. Also remember that all Python arrays (and lists, etc.) are zero based, whereas our VBA arrays were dimensioned as base 1.  All the array indices therefore should be reduced by 1:

    if NR == 1:
        QuadA[0, 0] = SR
        QuadA[2, 0] = NR
    else:
        if LR < SR:
            QuadA[0, 0] = LR
            QuadA[1, 0] = SR
            QuadA[2, 0] = NR
            if NR == 0:
                QuadA[0, 1] = LI
                QuadA[1, 1] = SI
                QuadA[2, 1] = 2
...

After making all the required changes, saving the module, and re-loading in Excel, we finally have a function that works:

py_quad

Posted in Excel, Link to Python, NumPy and SciPy, UDFs, VBA | Tagged , , , , , , | 11 Comments