I recently discovered that Microsoft are working on Excel functions to work with Regular Expressions. A beta version is currently available to those on the “insider” program, but this functionality has long been available through VBA, and can also be accessed from Python, using pyxll.
Details of the new Excel functions are at:
https://insider.microsoft365.com/en-us/blog/new-regular-expression-regex-functions-in-excel
I have previously covered the VBA use of regular expressions in detail at:
https://newtonexcelbach.com/2014/11/30/extracting-numbers-from-text-and-regular-expressions/
https://newtonexcelbach.com/2014/12/08/more-on-regular-expressions/
https://newtonexcelbach.com/2017/03/11/extracting-numbers-from-text-update/
Detailed documentation of the Python code for regular expressions is at:
From the Analyst Cave tutorial I was able to very quickly write four Excel user defined functions using pyxll to access the Python code:
import re
@xl_func()
@xl_arg('regexp', 'str')
@xl_arg('txt', 'str')
def RESearch(regexp, txt, rtn = 3):
"""
Returns a Match object if there is a match for regexp anywhere in txt
:param regexp: Regular expression
:param txt: Text to be searched
:param rtn: Value to be returned; 1 = span array for match location, 2 = input txt, 3 = list of matches (default)
"""
x = re.search(regexp, txt)
try:
# span() string group()
if rtn == 1:
return x.span()
elif rtn == 2:
return x.string
else:
return x.group()
except:
return "No match"
@xl_func()
@xl_arg('regexp', 'str')
@xl_arg('txt', 'str')
def REFindall(regexp, txt):
"""
Returns a list containing all matches
:param regexp: Regular expression
:param txt: Text to be searched
"""
x = re.findall(regexp, txt)
try:
return x
except:
return "No match"
@xl_func()
@xl_arg('regexp', 'str')
@xl_arg('txt', 'str')
def RESplit(regexp, txt):
"""
Returns a list where the string has been split at all matches
:param regexp: Regular expression
:param txt: Text to be searched
"""
x = re.split(regexp, txt)
try:
return x
except:
return "No match"
@xl_func()
@xl_arg('regexp', 'str')
@xl_arg('replace', 'str')
@xl_arg('txt', 'str')
def RESub(regexp, replace, txt):
"""
Replaces one or many matches with a string
:param regexp: Regular expression
:param txt: Text to be searched
"""
x = re.sub(regexp, replace, txt)
try:
return x
except:
return "No match"
Example results are shown below, using an example from the Microsoft site:
