Regex.Matches skipping a match? C#

I need to identify substrings found in a string, for example:

"CityABCProcess Test" or "cityABCProcess test"

To obtain:

["City", "ABC", "Process", "Test"]

  • The first letter in a substring can be lowercase or uppercase.
  • Any run of consecutive uppercase letters is a substring that ends when a lowercase letter or a space is found: "ABCProcess" → "ABC", "ABC Process" → "ABC".
  • If an uppercase letter is followed by a lowercase letter, the substring is everything up to the next uppercase letter.

Regular expression used:

"[A-Z][a-z]+|([A-Z]|[0-9])+\b|[A-Z]+(?=[A-Z])|([a-z]|[0-9])+"

This works fine, but fails for the string:

"X-999"

It is implemented like this:

        StringBuilder builder = new StringBuilder();
        builder.Append("[A-Z][a-z]+|([A-Z]|[0-9])+\b|[A-Z]+(?=[A-Z])|([a-z]|[0-9])+");

        foreach (Match match in Regex.Matches(name, builder.ToString()))
        {
            //do things with each match
        }

      

The problem here is that it doesn't match "X", only "999". Any ideas? I tested it on regexr.com, and it says this regex should match two substrings.



1 answer


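\b is interpreted as an escape sequence (\u0008, backspace) in a regular C# string literal, so the regex engine receives a backspace character instead of a word-boundary anchor. Escape the backslash (i.e. \\b), or use a verbatim string literal with the @ prefix: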



        builder.Append(@"[A-Z][a-z]+|([A-Z]|[0-9])+\b|[A-Z]+(?=[A-Z])|([a-z]|[0-9])+");
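
To see the difference for yourself, here is a minimal sketch that runs the same pattern text through both kinds of literal against "X-999". The class name BackspaceVsBoundary and the Tokens helper are hypothetical names made up for this illustration; only the pattern and the input come from the question.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class BackspaceVsBoundary
{
    // Regular literal: the compiler turns \b into U+0008 (backspace)
    // before the regex engine ever sees the pattern.
    public const string Regular =
        "[A-Z][a-z]+|([A-Z]|[0-9])+\b|[A-Z]+(?=[A-Z])|([a-z]|[0-9])+";

    // Verbatim literal: the backslash is kept, so the engine sees \b
    // and treats it as a word-boundary anchor.
    public const string Verbatim =
        @"[A-Z][a-z]+|([A-Z]|[0-9])+\b|[A-Z]+(?=[A-Z])|([a-z]|[0-9])+";

    // Hypothetical helper: collects the matched substrings into an array.
    public static string[] Tokens(string input, string pattern) =>
        Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

    public static void Main()
    {
        // Check whether each pattern string contains a backspace character.
        Console.WriteLine(Regular.IndexOf('\u0008') >= 0);   // True
        Console.WriteLine(Verbatim.IndexOf('\u0008') >= 0);  // False

        Console.WriteLine(string.Join(", ", Tokens("X-999", Regular)));   // 999
        Console.WriteLine(string.Join(", ", Tokens("X-999", Verbatim)));  // X, 999
    }
}
```

With the regular literal, the second alternative can only match when the letters or digits are followed by an actual backspace character, so "X" is never matched and "999" is picked up by the last alternative. With the verbatim literal, the second alternative matches both "X" and "999" at their word boundaries.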

      
