Extracting and counting factors with Python and Numba

This post features code using the pyxll add-in to call Python code from Excel. See pyxll for more details and downlod.

A recent issue of New Scientist had a series of puzzles finishing with: how many integers between 1 and 1 million have the property that 1/3 of their factors are divisible by 3.

This can actually be solved without factorising 1 million numbers, which is the point of the puzzle, but it is an interesting exercise to write the code to do it the brute force way in a reasonable time.

First a function to list all the factors of any input integer:

def py_factors1(n):
    factlist = reduce(list.__add__, ([i, n//i] for i in range(1, int(n**0.5) + 1) if n % i == 0))
    return list(set(factlist))

Then count the total number of factors, and that number that are exactly divisible by an input value:

def py_countfact1(factlist, n):    
    numfact = len(factlist)
    numdiv = 0
    for i in range(0, numfact):
        if factlist[i] % n == 0: numdiv = numdiv+1
    return(numfact, numdiv)

Then call each function over the given range, and count the number that satisfy the required factor ratio.:

@xl_func
@xl_arg('factrange', 'int[]')
@xl_arg('n', 'int')
@xl_arg('ratio', float)
@xl_arg('step', 'int')
def py_countfactrange1(factrange, n, ratio, step = 1):
    stime = time.perf_counter()
    start, stop = factrange[:]
    rtnnum = 0
    for i in range(start, stop, step):
        fact = py_factors1(i)
        countfact = py_countfact1(fact, n)
        if countfact[1]/countfact[0] == ratio: rtnnum = rtnnum + 1
    return rtnnum, time.perf_counter() - stime

The New Scientist puzzle is solved by this code in about 9 seconds, finding that there are no numbers satisfying the required factor ratio:

The solution time can be greatly reduced with the Numba just-in-time compiler. For the second function it is only necessary to add the Numba decorator at the top of the code. I have also added pyxll decorators, so it can be called from Excel:

@xl_func
@xl_arg('factlist', 'int[]')
@xl_arg('n', 'int')
@njit
def py_countfact(factlist, n):    
    numfact = len(factlist)
    numdiv = 0
    for i in range(0, numfact):
        if factlist[i] % n == 0: numdiv = numdiv+1
    return(numfact, numdiv)

The first function is not so simple. Numba does not work with set objects, or the reduce function, so it must be re-written:

@xl_func
@xl_arg('n', 'int')
@njit
def py_factors(n):    
    maxn = int(n**0.5) + 1
    factlist = [1]
    for i in range(2, maxn):
        if n % i == 0:
            if not i in factlist:
                factlist.append(i)
                if i != n//i: factlist.append(n//i)
    factlist.append(n)
    return factlist

The calling function only requires the Numba decorator, and editing to call the new functions:

@xl_func
@xl_arg('factrange', 'int[]')
@xl_arg('n', 'int')
@xl_arg('ratio', float)
@xl_arg('step', 'int')
@jit
def py_countfactrange(factrange, n, ratio, step = 1):
    stime = time.perf_counter()
    start, stop = factrange[:]
    rtnnum = 0
    for i in range(start, stop, step):
        fact = py_factors(i)
        countfact = py_countfact(fact, n)
        if countfact[1]/countfact[0] == ratio: rtnnum = rtnnum + 1
    return rtnnum, time.perf_counter() - stime

The new code reduces the execution time to 0.47 seconds:

The functions can be called from Excel with any desired range, divisor and target ratio:

I have added the functions described above and the associated code to the pyxll examples.xlsx and worksheetfuncs.py files, which can be downloaded from: FactorUDFs.zip

This entry was posted in Excel, Link to Python, Maths, Newton, PyXLL, UDFs and tagged , , , , , , , . Bookmark the permalink.

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.