Regular expression for Excel cell with designation R1C1
I need some code to check if a cell contains a formula with a reference to another cell.
I found the answer Find all used references in Excel formula , but the solution does not match also the formula with references to table columns like:
=SearchValInCol2(Tabella1[articolo];[@articolo];Tabella1[b])
Then I wrote the following VBA code using the Like operator , but definitely a regex solution would be more robust (I think the following code won't work in many scenarios).
Private Function TestIfCellContainsAFormula(cellToTest As Variant) As Boolean
Dim result As Object
Dim r As Range
Dim testExpression As String
Dim objRegEx As Object
Set r = cellToTest ' INPUT THE CELL HERE , e.g. RANGE("A1")
Set objRegEx = CreateObject("VBScript.RegExp")
objRegEx.IgnoreCase = True
objRegEx.Global = True
objRegEx.Pattern = """.*?""" ' remove expressions
testExpression = CStr(r.FormulaR1C1)
' search for pattern "=R[-3]C+4"
If testExpression Like "*R[[]*[]]*C*" Then
TestIfCellContainsAFormula2 = True
Exit Function
End If
' search for pattern "=RC[2]"
If testExpression Like "*R*C[[]*[]]*" Then
'If InStr(1, testExpression, "C[", vbTextCompare) <> 0 Then
TestIfCellContainsAFormula2 = True
Exit Function
End If
TestIfCellContainsAFormula2 = False
End Function
source to share
Option 1
To match R1C1 style references, you can use this regex :
R(\[-?\d+\])C(\[-?\d+\])|R(\[-?\d+\])C|RC(\[-?\d+\])
See the railroad diagram for a visual explanation:
There is an "offset" in the kernel -?\d+
, which is optional -
, followed by a digit or more. This sequence is enclosed in parentheses ( []
) to give \[-?\d+\]
. The regex then lets you combine:
-
R[offset]C[offset]
-
R[offset]C
or (|
) -
RC[offset]
or (|
)
Option 2
The regex above will not match R
, C
or RC
. He will meet the R[0]
, C[0]
, R[0]C
, RC[0]
and R[0]C[0]
, which are equivalent. To eliminate these matches, you can use this regular expression :
R(\[-?[1-9][0-9]*\])C(\[-?[1-9][0-9]*\])|R(\[-?[1-9][0-9]*\])C|RC(\[-?[1-9][0-9]*\])
But it seems that typing R[0]
, C[0]
and R[0]C[0]
in my Excel (v2013) turns them into R
, C
and RC
anyway - so you can avoid the extra complexity if it's not a concern.
Option 3
If you want to allow R
, C
and RC
, you can use a simpler regex:
R(\[-?\d+\])?C(\[-?\d+\])?
VBA test code
In this case, option 1 is used.
Option Explicit
Sub Test()
Dim varTests As Variant
Dim varTest As Variant
Dim varMatches As Variant
Dim varMatch As Variant
varTests = Array("RC", _
"R[1]C", _
"RC[1]", _
"R[1]C[1]", _
"R[-1]C", _
"RC[-1]", _
"R[-1]C[-1]", _
"=SUM(A1:B2)", _
"RC[1]+R[-1]C+R[2]C[-99]", _
"R[-1]C-R[1]C[-44]-RC[999]+R[0]C[0]", _
"SearchValInCol2(Tabella1[articolo];[@articolo];Tabella1[b])")
For Each varTest In varTests
varMatches = FormulaContainsR1C1Reference(CStr(varTest))
Debug.Print "Input: " & CStr(varTest)
Debug.Print VBA.String(Len(CStr(varTest)) + 7, "-")
If IsEmpty(varMatches) Then
Debug.Print "No matches"
Else
Debug.Print UBound(varMatches) & " matches"
For Each varMatch In varMatches
Debug.Print varMatch
Next varMatch
End If
Debug.Print vbCrLf
Next varTest
End Sub
Function FormulaContainsR1C1Reference(ByVal strFormula As String) As Variant
Dim objRegex As Object
Dim strPattern As String
Dim objMatches As Object
Dim varMatches As Variant
Dim lngCounter As Long
Set objRegex = CreateObject("VBScript.RegExp")
With objRegex
' setup regex
.Global = True
.IgnoreCase = False
.Pattern = "R(\[-?\d+\])C(\[-?\d+\])|R(\[-?\d+\])C|RC(\[-?\d+\])"
' get matches
Set objMatches = .Execute(strFormula)
' iterate matches
If objMatches.Count > 0 Then
ReDim varMatches(1 To objMatches.Count)
For lngCounter = 1 To objMatches.Count
varMatches(lngCounter) = objMatches.Item(lngCounter - 1)
Next lngCounter
Else
varMatches = Empty
End If
End With
FormulaContainsR1C1Reference = varMatches
End Function
A1 style attributes
I posted the regex here for A1 style references:
^(?:[A-Z]|[A-Z][A-Z]|[A-X][A-F][A-D])(?:[1-9]|[1-9][0-9]|[1-9][0-9][0-9]|[1-9][0-9][0-9][0-9]|[1-9][0-9][0-9][0-9][0-9]|[1-9][0-9][0-9][0-9][0-9][0-9]|10[0-3][0-9][0-9][0-9][0-9]|104[0-7][0-9][0-9][0-9]|1048[0-4][0-9][0-9]|10485[0-6][0-9]|104857[0-6])$
source to share