How to match a regular expression pattern and extract data from it

The text in a text area can contain zero or more substrings in the format {key-value}Some text{/key}, for example: This is my {link-123}test{/link} text area

I would like to iterate over every substring that matches this pattern, perform an action based on the key and value, and then replace that substring with a new string (a link retrieved by the key-based action).

How can I achieve this in C#?

+3




3 answers


If these tags are not nested, you only need to iterate over the file once; if nesting is possible, then you need to do one iteration for each nesting level.

This answer assumes that curly braces only occur as tag separators (and not, for example, inside comments):



result = Regex.Replace(subject, 
    @"\{                # opening brace
    (?<key>\w+)         # Match the key (alnum), capture into the group 'key'
    -                   # dash
    (?<value>\w+)       # Match the value (alnum), capture it as above
    \}                  # closing brace
    (?<content>         # Match and capture into the group 'content':
     (?:                # Match...
      (?!\{/?\k<key>)   # (unless there's an opening or closing tag
      .                 # of the same name right here) any character
     )*                 # any number of times
    )                   # End of capturing group
    \{/\k<key>\}        # Match the closing tag.", 
    new MatchEvaluator(ComputeReplacement), RegexOptions.Singleline | RegexOptions.IgnorePatternWhitespace);

public string ComputeReplacement(Match m) {
    // You can vary the replacement text for each match on-the-fly
    // m.Groups["key"].Value will contain the key
    // m.Groups["value"].Value will contain the value of the match
    // m.Groups["content"].Value will contain the content between the tags
    return ""; // change this to return the string you generated here
}
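The "one iteration for each nesting level" remark can be sketched as a loop that reapplies the same pattern until the text stops changing. This is only a minimal sketch: the class name NestedTags and the method ReplaceAll are made up for illustration, and it assumes the evaluator never emits text that itself matches the pattern (otherwise the loop would not terminate).

```csharp
using System;
using System.Text.RegularExpressions;

static class NestedTags
{
    // Same pattern as in the answer above, written on one line.
    // It only matches a tag pair whose content holds no tag of the same name,
    // so with same-name nesting the innermost pairs are matched first.
    static readonly Regex Tag = new Regex(
        @"\{(?<key>\w+)-(?<value>\w+)\}(?<content>(?:(?!\{/?\k<key>).)*)\{/\k<key>\}",
        RegexOptions.Singleline);

    // Hypothetical helper: repeat the replacement until a pass changes nothing.
    public static string ReplaceAll(string input, MatchEvaluator evaluator)
    {
        string previous;
        string current = input;
        do
        {
            previous = current;
            current = Tag.Replace(previous, evaluator);
        } while (current != previous);
        return current;
    }
}
```

Calling NestedTags.ReplaceAll with the ComputeReplacement method from above should take one pass per nesting level, plus a final pass that detects that nothing is left to replace.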

      

+2




Something like this?



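var result = Regex.Replace(text,
  // Verbatim string (@"...") so that \k<key> reaches the regex engine unchanged;
  // the value group needs + to match more than one character.
  @"[{](?<key>[^-]+)-(?<value>[^}]+)[}](?<content>.*?)[{][/]\k<key>[}]",
  match => {

    var key = match.Groups["key"].Value;
    var value = match.Groups["value"].Value;
    var content = match.Groups["content"].Value;

    return string.Format("The content of {0}-{1} is {2}", key, value, content);
  });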
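To perform a different action per key, as the question asks, one option is to look the key up in a dictionary of handlers inside the evaluator. The following is a hedged sketch, not part of the answer above: the class KeyDispatch, the Handlers table, and the "/items/" URL format are all hypothetical and stand in for whatever lookup the real application performs.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class KeyDispatch
{
    // Hypothetical handlers: key -> function of (value, content) returning the replacement.
    static readonly Dictionary<string, Func<string, string, string>> Handlers =
        new Dictionary<string, Func<string, string, string>>
        {
            // Assumed URL scheme, for illustration only.
            ["link"] = (value, content) =>
                "<a href=\"/items/" + value + "\">" + content + "</a>",
        };

    public static string Render(string text)
    {
        return Regex.Replace(text,
            @"\{(?<key>\w+)-(?<value>\w+)\}(?<content>.*?)\{/\k<key>\}",
            m =>
            {
                Func<string, string, string> handler;
                return Handlers.TryGetValue(m.Groups["key"].Value, out handler)
                    ? handler(m.Groups["value"].Value, m.Groups["content"].Value)
                    : m.Value; // unknown key: leave the original text untouched
            },
            RegexOptions.Singleline);
    }
}
```

With this sketch, KeyDispatch.Render("This is my {link-123}test{/link} text area") replaces the tagged span with an anchor element, and tags whose key has no handler are left as they are.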

      

+1




Use the .NET regex libraries. Here's an example that uses the Matches method:

http://www.dotnetperls.com/regex-matches

For text replacement, consider using a parser generator such as ANTLR:

http://www.antlr.org/wiki/display/ANTLR3/Antlr+3+CSharp+Target

Here is an example from the Matches article linked above:

using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // Input string.
        const string value = @"said shed see spear spread super";

        // Get a collection of matches.
        MatchCollection matches = Regex.Matches(value, @"s\w+d");

        // Use foreach loop.
        foreach (Match match in matches)
        {
            foreach (Capture capture in match.Captures)
            {
                Console.WriteLine("Index={0}, Value={1}", capture.Index, capture.Value);
            }
        }
    }
}
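Applied to the {key-value}Some text{/key} format from the question, the same Matches approach might look like the sketch below. It only inspects the matches and does not replace anything. The class TagScanner and the method Describe are hypothetical names, and the pattern assumes that keys and values consist of word characters only.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class TagScanner
{
    // Returns one description per tag found: "key-value at index: content".
    public static List<string> Describe(string text)
    {
        var found = new List<string>();
        MatchCollection matches = Regex.Matches(
            text,
            @"\{(?<key>\w+)-(?<value>\w+)\}(?<content>.*?)\{/\k<key>\}",
            RegexOptions.Singleline);

        foreach (Match m in matches)
        {
            found.Add(string.Format("{0}-{1} at {2}: {3}",
                m.Groups["key"].Value,
                m.Groups["value"].Value,
                m.Index,
                m.Groups["content"].Value));
        }
        return found;
    }
}
```

For the sample text "This is my {link-123}test{/link} text area", this yields a single entry, "link-123 at 11: test".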

      

For more information on C# regex syntax, you can use this cheat sheet:

http://www.mikesdotnetting.com/Article/46/CSharp-Regular-Expressions-Cheat-Sheet

0








