xlwings – dataframes and statistics

Another data conversion option offered by the latest xlwings release is Pandas dataframes and dataseries.  This post will look at how to pass an Excel range as a dataframe or dataseries in a User Defined Function (UDF), and some simple statistical applications, but this is barely scratching the surface of the potential of the Pandas Python Data Analysis Library.

To use the Pandas data structures, import pandas and xlwings; the data type can then be defined with an xw.arg decorator:

import pandas as pd
import xlwings as xw 

@xw.arg('datrange', pd.DataFrame, index = 0, header = 0)    
def rtnpdframe(datrange):
    return datrange
@xw.arg('datrange', pd.Series, header = 0)    
def rtnpdseries(datrange):
    return datrange

The Python functions can then be called from VBA:

Function GetDFrame(DRange As Variant)
On Error GoTo rtnerr:
    GetDFrame = Py.CallUDF(ModName, "rtnpdframe", Array(DRange), ThisWorkbook)
    Exit Function
    GetDFrame = Err.Description
End Function

Function GetDSeries(DRange As Variant)
On Error GoTo rtnerr:
    GetDSeries = Py.CallUDF(ModName, "rtnpdseries", Array(DRange), ThisWorkbook)
    Exit Function
    GetDSeries = Err.Description
End Function

Use of these functions is shown in the screenshots below:


The xl_Corr UDF shown above is taken from the xlwings documentation:

@xw.arg('x', pd.DataFrame, index=False, header=False)
@xw.ret(index=False, header=False)    
def xl_Correl(x):
    return x.corr()

I have added a simple extension to allow an Excel UDF to call any of the dataframe methods:

@xw.arg('x', pd.DataFrame, index=False, header=False)
@xw.ret(index=False, header=False)    
def xl_Stats(x, stat):
    method = getattr(x, stat)
    return method()

Use of this UDF with four different statistics functions is shown below.  In these examples the statistic is a single value, and the results for each of the three columns of the input range are returned as a single column with three rows.


The “describe” method returns 8 values for each column in the input range:


The Pandas documentation lists 20 common statistics functions available as dataframe methods.  The screenshot below shows the use of the xl_Stats UDF, in conjunction with the Index function to call any of these 20 functions.  In the example shown the cumsum function returns one row for each row of the input data (only the first three are shown):


The functions shown above have been added to:
available for free download, with full open source code.


This entry was posted in Excel, Link to Python, Maths, Newton, Python Pandas, UDFs, VBA and tagged , , , , , , , . Bookmark the permalink.

4 Responses to xlwings – dataframes and statistics

  1. Pingback: xlwSciPy 1.7 | Newton Excel Bach, not (just) an Excel Blog

  2. skip says:

    Hello, thank you very much. Its very helpful. I have done this cumsum UDF.

    import pandas as pd
    @xw.arg(‘x’, pd.DataFrame, index=False, header=False)
    @xw.ret(index=False, header=False)
    def xl_Correl_1(x):
    return x.cumsum()

    Do you think that there is any chance to modify this function to make calculation with numbers with many decimal places, using mpmath, or decimal?
    Suppose I will need to do cumsum on numbers like this:



    • dougaj4 says:

      As with your other query, you would have to enter the numbers in Excel as text strings, and transfer them to Python as text, then convert them to a suitable precision number format for the calculations, then back to a string for transfer back to Excel.
      I am not sure if Pandas dataframes have any restrictions on the number format. I’d suggest checking that before spending too much time on it.


      • Skip says:

        I see, this one works in python

        # Import pandas package
        import pandas as pd

        # import decimal library
        from decimal import *

        # set the number of decima places
        getcontext().prec = 30

        # create df from dictionary
        df = pd.DataFrame({‘num’: [2, 2, 2, 2, 2, 2, 2, 2]})

        #apply decimals on one column
        df[‘num’] = df[‘num’].apply(Decimal)

        # print results


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.