A Numpy trap – correction | Newton Excel Bach, not (just) an Excel Blog

A Numpy trap – correction

Posted on 20/07/2021 by dougaj4

In my post of 30th May this year (here) I said that:

As a check that the functions were working correctly, the Python functions were modified to return the sum of the largest array in the first row, revealing that the Numpy code was returning the wrong results! …
It seems that the Numpy arange function uses 32 bit integers, even if the range extends outside the 32 bit range!

That’s not quite right. In fact the range of Numpy 32 bit integers is -2,147,483,648 to 2,147,483,647 (which is the same as VBA Longs), and the largest value generated by the quoted arange function was only 100,000. The code generating an incorrect result (without generating an error message) was:

@xl_func
def numpysumn(k):
    a = np.arange(k)
    return a.sum()

With a k value of 100,000 the Numpy arange function generates a sequence from 1 to 99,000, so there is no problem generating the array, but the sum of the array members is 4,995,000,000, which exceeds the 32 bit integer limit.

Alternative solutions to this problem are:

Declare the arange function as a 64 bit integer:
a = np.arange(k, dtype=np.int64)
Declare k as a 64 bit integer:
k = np.int64(k)

In both cases the array a will have a 64 bit integer datatype, and a.sum will return the correct result.

This entry was posted in Arrays, Excel, Link to Python, NumPy and SciPy, PyXLL, UDFs and tagged arange data limit, int32, int64, Numpy, Numpy bug, Python. Bookmark the permalink.

Leave a comment Cancel reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.