Custom character class in C# regex

Question

Is there a way to define a custom character class in C# regex?

Flex does this in a very obvious way:

DIGIT    [0-9]
%%
{DIGIT}+    {printf( "An integer: %s (%d)\n", yytext, atoi( yytext ) );}

http://westes.github.io/flex/manual/Simple-Examples.html#Simple-Examples

As explained in this answer, in PHP a custom character class can be defined once and reused like this:

(?(DEFINE)(?<a>[acegikmoqstz@#&]))\g<a>(?:.*\g<a>){2}

Is there a way to achieve this in C# without repeating the complete definition of the character class every time it is used?

c# regex

PiotrB 16 Aug '14 at 15:16

2 answers

Panagiotis Kanavos · Answer 1 · 2014-08-20T15:12:15+0000

C# does not support defining custom named character classes, but you can use named blocks and character class subtraction to get a similar effect.

.NET defines a large number of named blocks and Unicode general categories, written \p{name}, that cover groups of Unicode characters such as Greek letters (\p{IsGreek}) or mathematical symbols (\p{Sm}). There may be one that already suits your requirements.
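A minimal sketch of how a named block and a general category might be used in place of a hand-written class. The class and method names (NamedBlockDemo, IsGreekWord, IsUpperOnly) are made up for illustration; only \p{IsGreek} and \p{Lu} come from .NET itself.

```csharp
using System.Text.RegularExpressions;

public static class NamedBlockDemo
{
    // \p{IsGreek} is the .NET named block for Greek and Coptic (U+0370-U+03FF).
    private static readonly Regex GreekWord = new Regex(@"\A\p{IsGreek}+\z");

    // \p{Lu} is the Unicode general category "Letter, uppercase".
    private static readonly Regex UpperOnly = new Regex(@"\A\p{Lu}+\z");

    // True when every character of s lies in the Greek block.
    public static bool IsGreekWord(string s) => GreekWord.IsMatch(s);

    // True when every character of s is an uppercase letter in any script.
    public static bool IsUpperOnly(string s) => UpperOnly.IsMatch(s);
}
```

The predefined names only help when the set you need happens to line up with a Unicode block or category; an arbitrary set such as the one in the question still has to be spelled out.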

Character class subtraction lets you start from a broad class or block and remove the characters of a narrower one from it. Syntax:

[base_group-[excluded_group]]

The following example, copied from the linked documentation, matches all Unicode characters except whitespace (\s), punctuation (\p{P}), characters in the Greek block (\p{IsGreek}), and the NEXT LINE control character (\x85):

[\u0000-\uFFFF-[\s\p{P}\p{IsGreek}\x85]]
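As a rough illustration of subtraction in use, the sketch below wraps a simple subtracted class and the documentation pattern above in two helper methods. The names SubtractionDemo, FirstConsonantRun, and PassesDocExample are hypothetical, and the \A and \z anchors are added here only so the second pattern tests exactly one character.

```csharp
using System.Text.RegularExpressions;

public static class SubtractionDemo
{
    // a-z minus the vowels: lowercase ASCII consonants.
    private static readonly Regex Consonants = new Regex(@"[a-z-[aeiou]]+");

    // The documentation pattern, anchored so the whole input must be
    // a single character from the subtracted class.
    private static readonly Regex DocExample =
        new Regex(@"\A[\u0000-\uFFFF-[\s\p{P}\p{IsGreek}\x85]]\z");

    // Returns the first run of consonants, or "" when there is none.
    public static string FirstConsonantRun(string s) => Consonants.Match(s).Value;

    // True when s is one character that survives the subtraction.
    public static bool PassesDocExample(string s) => DocExample.IsMatch(s);
}
```

Subtraction shortens the definition of a class, but the pattern still has to be repeated wherever the class is needed.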

Haney · Answer 2 · 2014-08-20T14:59:50+0000

No, this is not supported in C#. This link will give you a good overview of the .NET Regex engine. Note that nothing stops you from defining the class in a variable and using it to build the pattern string:

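using System.Text.RegularExpressions;
// Define the class once as a string and splice it into each pattern.
var digit = "[0-9]";
var regex = new Regex(digit + "[A-Z]");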
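The same idea can be applied to the PCRE pattern from the question. This is only a sketch: ComposedPattern, SymbolClass, and HasThree are illustrative names, and the pattern is a hand translation of the (?(DEFINE)...) version, not something the .NET engine provides.

```csharp
using System.Text.RegularExpressions;

public static class ComposedPattern
{
    // The character class is written once and reused below.
    private const string SymbolClass = "[acegikmoqstz@#&]";

    // One character from the class, followed twice by "anything, then
    // another character from the class". In an interpolated string,
    // {{2}} produces the literal quantifier {2}.
    private static readonly Regex AtLeastThree =
        new Regex($"{SymbolClass}(?:.*{SymbolClass}){{2}}");

    // True when s contains at least three characters from the class on one line.
    public static bool HasThree(string s) => AtLeastThree.IsMatch(s);
}
```

Because the class lives in a constant, changing its definition in one place updates every pattern built from it, which is the effect the flex and PCRE definitions give.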
