Custom character class in C# regex

Is there a way to define a custom character class in C# regex?

Flex does this in a very obvious way:

DIGIT    [0-9]
%%
{DIGIT}+    {printf( "An integer: %s (%d)\n", yytext, atoi( yytext ) );}

      

http://westes.github.io/flex/manual/Simple-Examples.html#Simple-Examples

As explained in this answer, in PHP, defining a custom character class works like this:

(?(DEFINE)(?<a>[acegikmoqstz@#&]))\g<a>(?:.*\g<a>){2}

      

Is there a way to achieve this result in C# without repeating the complete definition of the character class every time it is used?

2 answers


Named, reusable character classes are not supported in C#, but you can use named blocks and character class subtraction to get a similar effect.

.NET supports a large number of Unicode general categories and named blocks, such as mathematical operators or Greek characters, which you can match with \p{name}. There may be a category or block that already suits your requirements.
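As a minimal sketch of the \p{name} syntax, the snippet below matches a run of characters from the IsGreek named block and a run from the Lu (uppercase letter) general category. The sample text and variable names are made up for illustration, and the top-level statements assume C# 9 or later.

```csharp
using System;
using System.Text.RegularExpressions;

// \p{IsGreek} is a named block; \p{Lu} is a Unicode general category.
var greek = new Regex(@"\p{IsGreek}+");
var upper = new Regex(@"\p{Lu}+");

// Hypothetical sample input.
string text = "x = αβγ + Δ";

// First run of Greek-block characters, then first run of uppercase letters.
string firstGreek = greek.Match(text).Value;
string firstUpper = upper.Match(text).Value;

Console.WriteLine(firstGreek);
Console.WriteLine(firstUpper);
```

Here firstGreek is the lowercase Greek run and firstUpper is the capital delta, since "x" is lowercase and is skipped by \p{Lu}.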

Character class subtraction lets you start from a wider class or block and exclude the characters of a narrower one. Syntax:



[base_group-[excluded_group]]

      

The following example, copied from the linked documentation, matches every character in the range \u0000-\uFFFF except whitespace (\s), punctuation (\p{P}), characters in the Greek block (\p{IsGreek}), and the Unicode NEXT LINE control character (\x85):

[\u0000-\uFFFF-[\s\p{P}\p{IsGreek}\x85]]

      
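For a smaller illustration of the same syntax, here is a sketch that subtracts the vowels from a-z to match lowercase ASCII consonants. The input string and variable names are hypothetical, and the top-level statements assume C# 9 or later.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Base group a-z minus the excluded group aeiou: lowercase ASCII consonants.
var consonants = new Regex(@"[a-z-[aeiou]]");

// Collect every single-character match into one string.
// Cast<Match>() keeps this compatible with older frameworks where
// MatchCollection only implements the non-generic IEnumerable.
string found = string.Concat(
    consonants.Matches("custom class").Cast<Match>().Select(m => m.Value));

Console.WriteLine(found); // prints "cstmclss"
```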



No, this is not supported in C#. This link will give you a good overview of the .NET Regex engine. Note that nothing stops you from defining variables and using them to construct a Regex string:



var digit = "[0-9]";
var regex = new Regex(digit + "[A-Z]");

      
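Applied to the PHP pattern from the question, the same idea might look like the sketch below. The constant name A and the test inputs are made up for illustration, and the top-level statements assume C# 9 or later. One thing to watch for with string interpolation: literal braces must be doubled, so the quantifier {2} is written {{2}}.

```csharp
using System;
using System.Text.RegularExpressions;

// Define the class once, in place of PHP's (?(DEFINE)(?<a>...)).
const string A = "[acegikmoqstz@#&]";

// In an interpolated string, {{2}} produces the literal quantifier {2}.
var regex = new Regex($"{A}(?:.*{A}){{2}}");

// The expanded pattern that the engine actually sees.
string pattern = regex.ToString();
Console.WriteLine(pattern);

// Hypothetical inputs: three class characters match, two do not.
bool three = regex.IsMatch("a1c2e");
bool two = regex.IsMatch("a1c2");
Console.WriteLine($"{three} {two}");
```

The class definition still ends up repeated in the final pattern text, but the source code states it only once, which is usually what matters for maintenance.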
