More on Numba

Following my recent post Making Finite Element Analysis go faster … I have been having a closer look at the options in the Numba just-in-time compiler for improving the performance of Python code.

The Numba docs include a series of short code examples illustrating the main options. I have re-run these with pyxll based interface functions, so the routines can be called from Excel as a user defined function (UDF), and return the execution time for any specified number of iterations.

Typical code for timing a function is:

import time
import numpy as np
from pyxll import xl_func

@xl_func
def time_ident_npj(n):
    x = np.arange(n)
    stime = time.perf_counter()
    y = ident_npj(x)
    return time.perf_counter()-stime

Note that the calls to the time function must be outside the function being timed, since Numba does not support the time function.
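The same pattern can be wrapped in a small reusable helper. The sketch below is only an illustration: the name time_call and its warmup and repeats parameters are my own assumptions, not part of pyxll or Numba. The optional warm-up call is included because Numba normally compiles a function the first time it is called, so timing that first call would also measure the compilation.

```python
import time

def time_call(func, *args, warmup=True, repeats=1):
    """Return (average seconds per call, last result) for func(*args).

    time_call, warmup and repeats are hypothetical names for illustration.
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    if warmup:
        # Untimed first call, so any JIT compilation happens before the clock starts
        func(*args)
    start = time.perf_counter()
    for _ in range(repeats):
        result = func(*args)
    elapsed = (time.perf_counter() - start) / repeats
    return elapsed, result

# Usage with a plain Python function; a Numba-compiled function is called the same way
elapsed, total = time_call(sum, [1, 2, 3], repeats=5)
```

All the timing calls stay in ordinary Python, outside the function being timed, as in the code above.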

The first example from the Numba docs compared evaluation of an array function using Numpy arrays and Python loops, with or without Numba:

@njit
def ident_np(x):
    return np.cos(x) ** 2 + np.sin(x) ** 2

@njit
def ident_loops(x):
    r = np.empty_like(x)
    n = len(x)
    for i in range(n):
        r[i] = np.cos(x[i]) ** 2 + np.sin(x[i]) ** 2
    return r

Results from these functions are shown below, with times as reported in the Numba article, and as found with my code:

Using Numpy arrays, the Numba function was only slightly faster for me, and was slightly slower in the results shown in the Numba article. This is not surprising since the Python code had only a single call to the Numpy function, which is already C compiled code, so there is little scope for improving performance.

The function using Python loops was very much slower, and my results were slightly slower than the time in the Numba article. Presumably this is related to using different versions of Python. Adding the Numba decorator reduced the execution time for my code by a factor of 170, and the result was slightly faster than the Numpy function with Numba decorator.

The next examples look at the effect of the Numba “fastmath = True” and “parallel = True” options:

@njit(fastmath=False)
def do_sum(A):
    acc = 0.
    # without fastmath, this loop must accumulate in strict order
    for x in A:
        acc += np.sqrt(x)
    return acc

@njit(fastmath=True)
def do_sum_fast(A):
    acc = 0.
    # with fastmath, the reduction can be vectorized as floating point
    # reassociation is permitted.
    for x in A:
        acc += np.sqrt(x)
    return acc

@njit(parallel=True)
def do_sum_parallel(A):
    # each thread can accumulate its own partial sum, and then a cross
    # thread reduction is performed to obtain the result to return
    n = len(A)
    acc = 0.
    for i in prange(n):
        acc += np.sqrt(A[i])
    return acc

@njit(parallel=True, fastmath=True)
def do_sum_parallel_fast(A):
    n = len(A)
    acc = 0.
    for i in prange(n):
        acc += np.sqrt(A[i])
    return acc

Results for these functions were:

For this case the Numba compiled code was over 300 times faster than plain Python. The “fastmath = True” option was only of limited benefit in my case, although the Numba article results show a speed up of more than two times. Setting “parallel = True” increased performance by more than 10 times, with “fastmath = True” again only providing a small further gain. With both options applied, the Numba compiled code was almost 4000 times faster than the plain Python for this case.
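The comments in the code above note that fastmath permits floating point reassociation, and that the parallel version combines per-thread partial sums. The plain Python sketch below illustrates why regrouping matters: floating point addition is not exactly associative, so a regrouped sum can differ in the last digits. The chunked_sum function is a hypothetical stand-in for the partial-sum idea, not what Numba actually generates.

```python
import math

# Floating point addition is not exactly associative
a, b, c = 0.1, 0.2, 0.3
left_to_right = (a + b) + c   # strict left-to-right order
regrouped = a + (b + c)       # same terms, different grouping

def chunked_sum(values, n_chunks):
    """Sum each chunk separately, then add the partial sums.

    Hypothetical illustration of a partial-sum reduction; n_chunks must be positive.
    """
    size = max(1, math.ceil(len(values) / n_chunks))
    partials = [sum(values[i:i + size]) for i in range(0, len(values), size)]
    return sum(partials)

values = [math.sqrt(x) for x in range(1, 10001)]
strict = sum(values)              # one accumulator, strict order
chunked = chunked_sum(values, 4)  # four partial sums combined at the end

print(left_to_right == regrouped)                  # False
print(math.isclose(strict, chunked, rel_tol=1e-9))  # True
```

The two groupings agree to many significant figures, but not necessarily bit for bit, which is why the compiler may only reorder the accumulation when told it is allowed to.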

This raises the question of why the speed gain from Numba is often much smaller with more complex code. This will be examined in more detail in a later post, but the main reason is that when Numba is allowed to revert to Python mode for code it cannot compile (nopython = False), the resulting code can easily be almost all Python based. The same effect is found when choosing between the alternative @jit and @njit decorators. The @njit decorator results in all the Python code being compiled, and will raise an error if any of the code cannot be compiled by Numba. The alternative @jit decorator will switch to Python mode if any code cannot be compiled, but with much reduced (if any) speed improvement. Examples from my own code that will raise an error with @njit, or will not be fully compiled with @jit, include:

  • Use of the time function
  • Checking the data type of a variable, such as: if type(x) == tuple: …
  • Statements such as “StartSlope = EndSlope”, where EndSlope has not yet been defined at compile time.
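One possible way round items like these (a sketch only, not necessarily the approach that will be covered in the later post) is to keep the unsupported statements in a plain Python wrapper and pass only the numerical loop to @njit. The names sum_sqrt_kernel and timed_sum_sqrt are hypothetical, and the try/except fallback is only there so that the sketch still runs where Numba is not installed.

```python
import time
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback for illustration only: without Numba the kernel runs as plain Python
    def njit(func):
        return func

@njit
def sum_sqrt_kernel(a):
    # Purely numerical loop: no timing calls and no type checks in here
    acc = 0.0
    for i in range(a.shape[0]):
        acc += np.sqrt(a[i])
    return acc

def timed_sum_sqrt(values):
    # Type checks and timing stay in ordinary Python
    if isinstance(values, (tuple, list)):
        values = np.asarray(values, dtype=np.float64)
    sum_sqrt_kernel(values[:1])  # untimed warm-up call, triggers any compilation
    start = time.perf_counter()
    result = sum_sqrt_kernel(values)
    return result, time.perf_counter() - start

result, elapsed = timed_sum_sqrt((1.0, 4.0, 9.0))
```

The wrapper assumes the input is a tuple, a list, or a one dimensional float64 Numpy array.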

Concrete 2021

Edit 12th August: Extended early-bird registrations are now finally closing tomorrow, Friday 13th August!

Concrete 2021 (the 30th Biennial Conference of the Concrete Institute of Australia) is just around the corner, and this year will be all on-line:

https://ciaconference.com.au/

The virtual conference format not only greatly reduces travel and accommodation costs, but also means that all presentations will be downloadable on demand for 30 days after the conference. Early-bird registrations are available until midnight Saturday 31st July, East-Australia time (2:00 pm GMT), so click the link above for more information and to secure low-cost tickets for the four day conference.

https://1drv.ms/p/s!Aq0NeYoemF0ni95zzf9KTTIyYrbIIQ?e=uWe3Ar

Copying charts to a new workbook

When an Excel worksheet containing one or more charts is copied to a new workbook the charts still link to the data ranges in the original workbook. Over the years I have spent a fair bit of time editing the chart ranges to their intended location in the new workbook, but recently I decided to check if there is a better way, and found:

Copy Chart to New Sheet and Link to Data on New Sheet at Jon Peltier’s blog

The procedure is:

  • Right click on the tab of the worksheet to be copied
  • Select “Move or Copy …” then in the dialog box select “New book” under “To book”, and select the “Create a copy” check box
  • Click OK
  • The worksheet will be copied to a new file (called book1.xlsx), including any charts, with the charts linking to the data in the new file.
  • Save the new file with a new name, remembering to save as .xlsb or .xlsm if you want to add any VBA code.

Note that the worksheet may also be copied to an existing file, or to a new position in its current file. Also it is possible to copy more than 1 worksheet by selecting the sheets to be copied before right-clicking on one of the selected tabs.

Copying the worksheet:

The resulting new workbook, with chart linked to the data in the new file:


A Numpy trap – correction

In my post of 30th May this year (here) I said that:

As a check that the functions were working correctly, the Python functions were modified to return the sum of the largest array in the first row, revealing that the Numpy code was returning the wrong results!  …

It seems that the Numpy arange function uses 32 bit integers, even if the range extends outside the 32 bit range! 

That’s not quite right. In fact the range of Numpy 32 bit integers is -2,147,483,648 to 2,147,483,647 (which is the same as VBA Longs), and the largest value generated by the quoted arange function was only 100,000. The code generating an incorrect result (without generating an error message) was:

@xl_func
def numpysumn(k):
    a = np.arange(k)
    return a.sum()

With a k value of 100,000 the Numpy arange function generates a sequence from 0 to 99,999, so there is no problem generating the array, but the sum of the array members is 4,999,950,000, which exceeds the 32 bit integer limit.

Alternative solutions to this problem are:

  • Specify a 64 bit integer data type in the arange function:
    a = np.arange(k, dtype=np.int64)
  • Declare k as a 64 bit integer:
    k = np.int64(k)

In both cases the array a will have a 64 bit integer datatype, and a.sum will return the correct result.
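The overflow can be reproduced with the short sketch below. Whether Numpy's default integer is 32 or 64 bit depends on the platform and Numpy version, so the data type is set explicitly here, both for the array and for the accumulator used by sum. The function name sum_arange is a hypothetical name used only for this illustration.

```python
import numpy as np

def sum_arange(k, dtype):
    # Build 0 .. k-1 and sum it, forcing both array and accumulator to dtype
    a = np.arange(k, dtype=dtype)
    return int(a.sum(dtype=dtype))

wrapped = sum_arange(100_000, np.int32)  # overflows silently and wraps around
correct = sum_arange(100_000, np.int64)  # room for the full result

print(wrapped)  # 704982704
print(correct)  # 4999950000
```

The 32 bit result is the true sum less 2^32 (4,294,967,296), and no error or warning is raised.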


On long and short formulas and VBA

A recent thread at Eng-Tips asked the following apparently simple question:

Starting with a string consisting of numbers with a single central group of letters, how can this string be truncated at the end of the letters, so that for instance 3L481 becomes 3L?

This prompted a lengthy discussion, with many twists and turns, and even some useful answers to the original question.

The original question was looking for an on-sheet solution, rather than VBA, and the first working formula is shown below (all 420 characters of it):

A much more practical alternative (in my opinion) is a VBA user defined function (UDF), which requires just the one argument of the cell address. Two examples are shown below:

Public Function LeftToString(InpStr As String) As String
'
'  Removes trailing digits from a string that comprises
'  a mixture of UPPERCASE letters and digits.
'
'  Elaborations might be required for strings that contain:
'      Lower case letters
'      Only letters
'      Characters that are neither digits nor letters.
'
Dim L As Long           'Length of input string
Dim i As Long           'General purpose integer

L = Len(InpStr)
If L <= 0 Then
    LeftToString = ""
    Exit Function
End If
'
'  Loop backwards from the end of the input string until a letter is found.
'
For i = L To 1 Step -1
    If UCase(Mid(InpStr, i, 1)) >= "A" And UCase(Mid(InpStr, i, 1)) <= "Z" Then
        LeftToString = Left(InpStr, i)
        Exit Function
    End If
Next i
'
' Input string contains no letters.
'
LeftToString = InpStr
End Function

And a shorter version:

Function LeftToString2(X As String) As String
Dim i As Long

For i = Len(X) To 1 Step -1
    If (InStr("0123456789", Mid(X, i, 1))) = 0 Then
        LeftToString2 = Left(X, i)
        Exit Function
    End If
Next i
End Function

Results of the two UDFs are shown in the screen-shot above. Note that the first UDF checks for upper-case text, between A and Z, so the extracted character must be converted to upper case. The second avoids the problem by checking if the character is a number.
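For comparison, the logic of the two UDFs can be sketched in Python. This is a rough translation and not a replacement for the VBA; the names left_to_letter and left_to_non_digit are hypothetical. It also shows that the two approaches differ on edge cases: with an all-digit input the first returns the input unchanged and the second returns an empty string, and they treat a character such as "-" differently.

```python
def left_to_letter(s):
    # Mirrors LeftToString: scan backwards for the last letter (either case)
    for i in range(len(s) - 1, -1, -1):
        if "A" <= s[i].upper() <= "Z":
            return s[: i + 1]
    return s  # no letters found: return the input unchanged

def left_to_non_digit(s):
    # Mirrors LeftToString2: drop trailing digits, stop at the first non-digit
    return s.rstrip("0123456789")

for text in ("3L481", "12345", "3L-48"):
    print(repr(text), repr(left_to_letter(text)), repr(left_to_non_digit(text)))
# '3L481' '3L' '3L'
# '12345' '12345' ''
# '3L-48' '3L' '3L-'
```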

A much shorter on-sheet formula was then supplied:

Even with the shorter formula, it is not immediately obvious how it works, so I have split it up into its constituent parts. With the input string in Cell B30:

  • =RIGHT(B30,ROW(INDIRECT("1:"&LEN(B30)))) returns an array of progressively longer strings, starting from the right hand end.
  • =VALUE() converts each member of that array into either a number or #VALUE!.
  • =ISNUMBER() returns TRUE or FALSE for each of those values.
  • =SUM(--array) or =SUM(array*1) returns the number of TRUE values. The -- or *1 operators are required for the SUM function to treat TRUE as a value of 1, otherwise the SUM will always return 0.
  • Finally =LEFT(B30, LEN(B30)-K30) extracts the string up to the last text character, where K30 holds the count from the previous step.
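The steps listed above can be mirrored one for one in a short Python sketch, which may make the logic easier to follow. It is only an approximation: str.isdigit is used as a stand-in for ISNUMBER(VALUE()), and it will not agree with Excel for strings such as decimals or "4E1". The function name truncate_after_text is hypothetical.

```python
def truncate_after_text(s):
    n = len(s)
    # RIGHT(B30, ROW(INDIRECT("1:"&LEN(B30)))): progressively longer suffixes
    suffixes = [s[n - k:] for k in range(1, n + 1)]
    # ISNUMBER(VALUE(...)): approximated by testing for all-digit strings
    is_number = [t.isdigit() for t in suffixes]
    # SUM(--array): Python's sum already counts True as 1
    count = sum(is_number)
    # LEFT(B30, LEN(B30) - count)
    return s[: n - count]

print(truncate_after_text("3L481"))  # 3L
```

For "3L481" the suffixes are "1", "81", "481", "L481" and "3L481", of which the first three are numeric, so the count is 3 and the first two characters are returned.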

In the latest versions of Excel with “dynamic arrays” the second formula will work when entered with the Enter key, as usual. In older versions it must be entered as an array formula, by pressing Ctrl-Shift-Enter.

Even with the second simpler formula, to my mind the VBA function is the better alternative, both in terms of application in a new spreadsheet, and understanding how the thing works. For those not familiar with VBA, the process of creating a new function is quite simple:

  • Open a new spreadsheet (or an existing one you want to use the function in) and save with a chosen name.
  • Press Alt-F11 to open the VBA editor.
  • Right-click on VBAProject(spreadsheet name) in the list of open files on the left, and Insert-Module (see screenshot below)
  • Copy the VBA code and paste it in the new module.

Finally I recommend having a look at the full Eng-Tips thread linked above for discussion on a variety of topics, including whether to declare integer variables as Integer or as Long, how the first long formula works, the quick way to review long formulas without splitting them up into parts, and VBA code for highlighting the active cell.
