More on Regular Expressions

The previous post presented a User Defined Function (UDF) using regular expressions to extract all numbers, or a selected number, from any string (or a list of strings).  In this post I will look more closely at the code used to achieve this, and present a general purpose UDF allowing regular expressions to be used on strings.  A spreadsheet with examples of both functions, and full open-source code, can be downloaded from RegExpres.xlsb.

First the Regular Expression object must be created and given the appropriate criteria to return all numbers (0-9), the decimal character (“.”), the exponent symbol (“E”), and the plus and minus symbols (“+-“).  It is also set to return all matching strings (Global = True) and to ignore the case of alphabetic characters (IgnoreCase = True):


Const ExpString As String = "E", pmString As String = "+-"
'Initialize variables
Set RE = CreateObject("VBScript.RegExp")
'Criteria for Regular Expression
REPatt = "[0-9" & DecString & ExpString & pmString & "]"
With RE
.Pattern = REPatt
.Global = True
.IgnoreCase = True
End With

There follows some code to convert the input range to a variant array and count the number of rows; the numbers in each string are then extracted with the line:

Set Matches = RE.Execute(NumString)

As discussed in the previous post, for a string containing a single value, that is all that is required, but if there is more than one number all the number strings will be concatenated, and must be separated. This can be done by stepping through the original string and comparing the characters with those in the Matches variable, as shown below:

'Loop Matches collection and compare with each character of original string to build strings of all numbers in the sample string
        strNumber = ""
        NumMatches = Matches.Count
        NumChar = Len(NumString)
        j = 0
        k = 1
        PrevMatch = False
        For i = 1 To NumChar
            MatchString = Mid(NumString, i, 1)
            If PrevMatch = False And UCase(MatchString) = ExpString Then
                j = j + 1
            ElseIf PrevMatch = False And (MatchString = "+" Or MatchString = "-") And _
                   IsNumeric(Mid(NumString, i + 1, 1)) = False Then
                j = j + 1
            ElseIf MatchString <> IgnoreString Then
                Digit = Matches(j)
                If Digit = MatchString Then
                    strNumber = strNumber + Matches(j)
                    j = j + 1
                    PrevMatch = True
                ElseIf PrevMatch = True Then
                    If Position = 0 Then
                        OutA(Row, k) = CDbl(strNumber)
                    ElseIf Position = k Then
                        OutA(Row, 1) = CDbl(strNumber)
                        Exit For
                    End If
                    strNumber = ""
                    k = k + 1
                    PrevMatch = False
                End If
            End If
        Next i
        If PrevMatch = True Then
            If Position = 0 Then
                OutA(Row, k) = CDbl(strNumber)
            ElseIf k = Position Then
                OutA(Row, 1) = CDbl(strNumber)
            End If
        End If
        If k < Position Then OutA(Row, 1) = CVErr(xlErrNA)
    Next Row

Regular Expressions also have many other uses, and the Reg_Exp UDF allows them to be applied quickly and easily. The Regular Expression object has the methods “List”, “Test”, and “Replace”, and the screen-shots below show the use of these methods on a list of strings:

Extract all numbers

RegExp1

Extract all occurrences of the string “Node”

RegExp2

Find strings containing “Node”

RegExp3

Replace “Node” with “Number”

RegExp4

Find all four letter strings starting with “n” and ending with “e”

RegExp5

For more details of setting up regular expression patterns see:
VBA RegEx: How to Evaluate and Use “Regular Expressions”

Finally note that if you want to apply these UDFs to a range they must be entered as an Array Function to return all the results.

Posted in Excel, UDFs, VBA | Tagged , , , , | 10 Comments

Extracting numbers from text and regular expressions

Daily Dose of Excel recently had a post on extracting a number from a specified position in a text string, using mega-in-sheet-formulas like:

ExtractNum2-0

This reminded me of a post from earlier this year, which provided a User Defined Function (UDF) that used regular expressions to extract numbers from a string.  This worked well if the string contained a single number, but returned incorrect results if there were two or more numbers.

I have now updated this function to return correct results from a string (or array of strings) with two or more separate numbers.  The new function (ExtractNums) is compared below with the previous version (ExtractNum), and another function that extracts numbers from the right hand end of a string.  As well as finding all the numbers, the new function will also deal with negative numbers, numbers with a comma separator for thousands, and numbers in E notation.  The new function may be downloaded from RegExpres.xlsb

In the screenshot below the function returns all the numbers from each string, in an array:ExtractNum2-1

It is also possible to specify a particular number from the sequence to return.  The screenshot below shows results for a range of different number formats, compared with two functions posted in the comments to the DDoE article:

ExtractNum2-2

In the screenshot below the third number in the sequence has been requested. When there are less than 3 numbers the function returns #N/A. Note that the Extractnums function recognises 123,456 as a single number, whereas the DDoE functions treat it as two separate numbers.
ExtractNum2-5

For speed of operation, The original ExtractNum function (which does no checking of the string returned by the regular expression function) is about 5 times faster than ExtractNums, so is useful for strings that contain a single numerical string.  The F_snb function is about 30% faster than ExtractNums, but does not recognise comma separators, E Notation, or numbers starting with a minus sign.  The F_snb2 function (which is written as a single line of VBA code), is about 10 times slower.

The next post will have a closer look at the code, and present another function for working with regular expressions.

Posted in Arrays, Excel, UDFs, VBA | Tagged , , , , | 4 Comments

Range tips

A couple of tips on Excel ranges, coming from questions on the Eng-Tips Forum:

1. How to rotate a range:

The Transpose function (or option in the Paste-Special dialog box) will swap the rows and columns of a range, but there is no simple built-in way to rotate a range of data through 90 degrees.  The VBA code below will create a User Defined Function (UDF) that will do the job:

Function RotateRange(RRange As Variant) As Variant
Dim NumRows As Long, NumCols As Long, i As Long, j As Long, RotnA() As Variant

RRange = RRange.Value2
NumRows = UBound(RRange)
NumCols = UBound(RRange, 2)
ReDim RotnA(1 To NumCols, 1 To NumRows)
For i = 1 To NumRows
For j = 1 To NumCols
RotnA(j, i) = RRange(NumRows + 1 - i, j)
Next j
Next i
RotateRange = RotnA
End Function

Copy and paste this code into a VBA module.

To use the function, enter =RotateRange(address of range to be rotated) in the top-left cell of the destination range, then select the entire output range, and enter the function as an array function to display the whole rotated range.

RotateRange

Update 29 Nov 2014:  Further to the comments from Ian Huitson and Lori below, I have uploaded a spreadsheet with examples of the UDF and their suggestions:

RotateRange2

 

2. How to show the address of a named range (or not):

The writer of this question was having a problem in Excel 2007 with named ranges in formulas being converted to cell addresses whenever a cell was edited, either by pressing the F2 key, or by double clicking on the cell.  This was happening in some files, but not others.

The solution to the problem was to go into Options-Advanced, scroll right to the bottom, and de-select the two boxes under “Lotus 123 Compatibility Settings for:”  Note that each sheet of a workbook can have different settings, so it may be necessary to check every sheet.

Further examination found that this option works differently in the more recent Excel versions (2010 and 2013).

For the later versions if you double click in the cell, or press F2 the formula box at the cell displays the cell address, but the formula bar still displays the range name.  If you click in the formula bar both the formula box and the formula bar display the cell address, rather than the range name, but the range name returns when you press enter.  This behaviour is actually quite useful, but it seems to be little known.

Strangely, Lotus 123 itself (at least in Rel. 9.5) does not replicate either of these behaviours.  When you click on a cell, or click on a formula bar, the range names remain as names, just like in Excel, when “Lotus 123 Compatibility” is not selected!

Posted in Excel, UDFs, VBA | Tagged , , , , | 6 Comments

The Gladesville Bridge is 50

The Gladesville Bridge in Sydney is much less well known than the steel arch bridge just up-river, but in many ways the methods used in its design and construction were more innovative, and at the time of its completion it was the longest span concrete arch bridge in the World.

The selection of photographs below are taken from a recently published article by Rob Wheen, who as a young man was a keen observer of the bridge construction.  The full article is freely available for download from: Gladesville Bridge Article, and members of The Concrete Institute of Australia may view the full issue (and previous magazines back to 2008) at Concrete in Australia.

gladesville1

gladesville2

gladesville3

gladesville4

gladesville5

Finally, The Happy Pontist has recently posted a series of articles about bridges in the Avignon region of France, including the well known original Avignon Bridge (Pont Saint-Bénézet), which was constructed in parallel segments, pre-dating the use of this technique in the Gladesville Bridge by some 700 years:

Posted in Arch structures, Historic Bridges, Newton | Tagged , | Leave a comment

Durability of Concrete Structures

Don’t put yourself or your concrete assets at risk.

Designing and constructing for durability is one of the most important aspects of any construction program. However, requirements for concrete durability are extremely fragmented in Australia. These conflicting requirements put designers, contractors, and asset owners at risk, as there a number of different approaches available that may or may not be appropriate for the specific structure.

In recent times the Concrete Institute of Australia Durability Committee, including some of Australia’s leading durability experts, have been looking at the inconsistent approach to durability issues and how this can be addressed. To highlight this work the Concrete Institute of Australia is presenting a National Seminar Series on the Durability of Concrete Structures with 5 of these leading concrete durability experts delivering a comprehensive series of presentations aligned with the major design issues.

This seminar is an ideal opportunity for designers, contractors, asset owners and suppliers, in all sectors and of all levels of experience, to develop a more consistent approach to durability and to discuss issues of current and future significance with a collection of Australian durability experts, rarely available in one room at the same time.

These 6 topics, presented by our panel of experts, will be discussed, with a view to developing a more consistent approach in the industry:

  • Durability Planning
  • Durability Exposure Zones
  • Design for Compliance
  • Construction for Durability
  • Durability Modelling
  • Durability Testing

Delegates not only get a chance to learn from, debate with, and discuss the issues with our experts, but will also receive two brand new documents on Durability produced by the Institute, as well as being the first to receive draft Durability documents, open for public comment from members and non-members attending the seminar.

With only a few days left to register, don’t miss out on this opportunity to listen to some of the best durability experts in the country!

Register here!

Posted in Concrete, Newton | Tagged , , | Leave a comment