Excel macro to replace part of a string
I am having trouble replacing part of a string in a data range that contains comments.
If the ID numbers appear, I need to replace the middle of the ID numbers with Xs (for example 423456789
, to become 423xxx789
). Identifiers only start with 4
or 5
, and any other number should be ignored as it may be required for other purposes.
Unfortunately, since these are comments, the data is incompatible with formatting, which adds a layer of complexity.
Representative data looks like this:
523 123 123
523123123
ID 545 345 345 is Mr. Jones
Primary ID 456456456 for Mrs. Brown
Mr. Smith Id is 567567567
I need a code to replace only the middle 3 digits of the ID number and leave the rest of the cell intact so that
ID 545 345 345 is Mr. Jones
Primary ID 456456456 for Mrs. Brown
Becomes (with or without spaces around X
s)
ID 545 xxx 345 is Mr. Jones
Primary ID 456xxx456 for Mrs. Brown
The regex I have finds strings with IDs successfully and works fine for cells with no other text. Unfortunately for other cells, it will not replace only 3 digits that need to be replaced and makes the data a mess of the cell. My code below works for the first two cells above and then doesn't work that well for the rest. Please, help.
Sub FixIds()
Dim regEx As New RegExp
Dim strPattern As String: strPattern = "([4][0-9]{2})([^a-zA-Z0-9_]?[0-9]{3})([^a-zA-Z0-9_]?[0-9]{3})|([5][0-9]{2})([^a-zA-Z0-9_]?[0-9]{3})([^a-zA-Z0-9_]?[0-9]{3})"
Dim strReplace As String: strReplace = ""
Dim strInput As String
Dim Myrange As Range
Dim NewPAN As String
Dim Aproblem As String
Dim Masked As Long
Dim Problems As Long
Dim Total As Long
'Set RegEx config/settings/properties
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = strPattern ' sets the regex pattern to match the pattern above
End With
Set Myrange = Selection
MsgBox ("The macro will now start masking IDs identified in the selected cells only.")
' Start masking the IDs
For Each cell In Myrange
Total = Total + 1
' Check that the cell is long enough to possibly be an ID and isn't already masked
Do While Len(cell.Value) > 8 And Mid(cell.Value, 5, 1) <> "x" And cell.Value <> Aproblem
If strPattern <> "" Then
cell.NumberFormat = "@"
strInput = cell.Value
NewPAN = Left(cell.Value, 3) & "xxx" & Right(cell.Value, 3)
strReplace = NewPAN
' Depending on the data, fix it
If regEx.Test(strInput) Then
cell.Value = NewPAN
Masked = Masked + 1
Else
' Adds the cell value to a variable to allow the macro to move past the cell
Aproblem = cell.Value
Problems = Problems + 1
' Once the macro is trusted not to loop forever, the message box can be removed
' MsgBox ("Problem. Regex fail? Bad data = " & Aproblem)
End If
End If
Loop
Next cell
' All done
MsgBox ("IDs are now masked" & vbCr & vbCr & "Total cells highlighted (including blanks) = " & Total & vbCr & "Cells masked = " & Masked & vbCr & "Problem cells = " & Problems)
End Sub
source to share
I removed the loop Do... While
and changed the logic in the code For Each cell In Myrange
to handle matches one by one and create specific replacements if we have a non-empty value in the first or fourth capturing group (we can choose which values to pick for replacement, then).
For Each cell In Myrange
Total = Total + 1
' Check that the cell is long enough to possibly be an ID and isn't already masked
If strPattern <> "" Then
cell.NumberFormat = "@"
strInput = cell.Value
' Depending on the data, fix it
If regEx.test(strInput) Then
Set rMatch = regEx.Execute(strInput)
For k = 0 To rMatch.Count - 1
toReplace = rMatch(k).Value
If Len(rMatch(k).SubMatches(0)) > 0 Then ' First pattern worked
strReplace = rMatch(k).SubMatches(0) & "xxx" & Trim(rMatch(k).SubMatches(2))
Else ' Second alternative is in place
strReplace = rMatch(k).SubMatches(3) & "xxx" & Trim(rMatch(k).SubMatches(5))
End If
cell.Value = Replace(strInput, toReplace, strReplace)
Masked = Masked + 1
Next k
Else
' Adds the cell value to a variable to allow the macro to move past the cell
Aproblem = cell.Value
Problems = Problems + 1
' Once the macro is trusted not to loop forever, the message box can be removed
' MsgBox ("Problem. Regex fail? Bad data = " & Aproblem)
End If
End If
Next cell
Here's the result:
source to share