Help with regex tag removal
I have lines of the form "[user: fred] [priority: 3] Lorem ipsum dolor sit amet.", where each section enclosed in square brackets is a tag in the format [key: value]. I need to remove the tag with a given key, using the following extension method:
public static void RemoveTagWithKey(this string message, string tagKey) {
if (message.ContainsTagWithKey(tagKey)) {
var regex = new Regex(@"\[" + tagKey + @":[^\]]");
message = regex.Replace(message , string.Empty);
}
}
public static bool ContainsTagWithKey(this string message, string tagKey) {
return message.Contains(string.Format("[{0}:", tagKey));
}
Only the tag with the specified key should be removed from the string. My regex doesn't work because it's stupid. I need help writing it correctly. An implementation without regex would also be welcome.
If you want to do it without Regex, it is not difficult. You're already looking for a specific tag key, so you can search for "[" + tagKey + ":", then search from there for the closing "]" and delete everything between those offsets, brackets included. Something like...
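// Find the start of the tag, e.g. "[priority:".
int posStart = message.IndexOf("[" + tagKey + ":");
if (posStart >= 0)
{
    // Find the closing bracket that follows it.
    int posEnd = message.IndexOf("]", posStart);
    if (posEnd > posStart)
    {
        // +1 so the closing "]" is removed too.
        message = message.Remove(posStart, posEnd - posStart + 1);
    }
}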
Is it better than a Regex solution? Since you are only looking for a specific key, I think this one probably wins on simplicity. I love Regexes, but they are not always the clearest answer.
Edit: Another reason the IndexOf() solution might be better is that it uses only one rule for finding the start of a tag, whereas the original code calls Contains() to look for something like "[tag:" and then uses a regular expression with a slightly different pattern for the replacement/removal. In theory, you could have text that meets one criterion but not the other.
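To make the single-rule idea concrete, here is a minimal sketch of the IndexOf() approach wrapped in a method. The class name TagRemoval and the method name RemoveTagByIndex are made up for illustration, and the sketch assumes that only the first tag with the given key needs to go and that leftover whitespace around the removed tag is acceptable.

```csharp
using System;

public static class TagRemoval
{
    // Hypothetical helper, not from the question. It returns a new string
    // because .NET strings are immutable.
    public static string RemoveTagByIndex(string message, string tagKey)
    {
        // The one and only rule for locating the tag: the literal "[key:" prefix.
        int start = message.IndexOf("[" + tagKey + ":", StringComparison.Ordinal);
        if (start < 0)
        {
            return message; // no such tag, nothing to remove
        }

        // First closing bracket at or after the start of the tag.
        int end = message.IndexOf(']', start);
        if (end < 0)
        {
            return message; // unterminated tag, leave the text alone
        }

        // +1 so the closing bracket is removed as well.
        return message.Remove(start, end - start + 1);
    }
}
```

The caller has to use the return value, for example message = TagRemoval.RemoveTagByIndex(message, "priority");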
I know there are many more feature-rich tools out there, but I love the simplicity and cleanliness of Code Architects Regex Tester (also known as YART: Yet Another Regex Tester). It shows groups and captures in a tree view, and it is quite fast, very small, and open source. It also generates code in C++, VB, and C#, and can automatically escape or unescape regular expressions for those languages. I put it in the VS tools folder (C:\Program Files\Microsoft Visual Studio 9.0\Common7\Tools) and set up a menu item for it under Tools > External Tools so that I can quickly launch it from VS.
Regular expressions are sometimes very difficult to write, and I know it really helps to be able to validate the regular expressions and see the results as you go.
(source: dotnet2themax.com )
Another really popular (but not free) option is RegexBuddy.
I think this is the regex you are looking for:
string regex = @"\[" + tagKey + @":[^\]]+\]";
Plus, you don't need a separate check to see whether a tag of this type is present. Just run the replacement; if there is no match, the original string is returned.
public static string RemoveTagWithKey(string message, string tagKey) {
    // Matches "[key:", then one or more characters that are not "]", then the closing "]".
    string regex = @"\[" + tagKey + @":[^\]]+\]";
    return Regex.Replace(message, regex, string.Empty);
}
You seem to be writing an extension method, but I wrote it as a static utility method to keep things simple.
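If you do want to keep it as an extension method, note that it has to return the new string: .NET strings are immutable and the parameter is passed by value, so assigning to message inside a void method never changes the caller's string. Below is a minimal sketch of that variant. The class name MessageTagExtensions is made up, and wrapping the key in Regex.Escape is my own addition, on the assumption that keys might contain regex metacharacters such as "." or "+".

```csharp
using System.Text.RegularExpressions;

public static class MessageTagExtensions
{
    // Returns the message with every "[tagKey: ...]" tag removed.
    // The caller must use the return value; the input string is not modified.
    public static string RemoveTagWithKey(this string message, string tagKey)
    {
        // Regex.Escape keeps metacharacters in the key from being read as pattern syntax.
        // [^\]]* is zero or more non-"]" characters, so an empty value also matches.
        string pattern = @"\[" + Regex.Escape(tagKey) + @":[^\]]*\]";
        return Regex.Replace(message, pattern, string.Empty);
    }
}
```

Usage would then look like message = message.RemoveTagWithKey("priority");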