Need to test for "\\" (backslash) in this regex

I am currently using this reg ex:

"\bI([ ]{1,2})([a-zA-Z]|\d){2,13}\b"

      

The text I am matching against might contain "\" (backslash). How do I add this character to the expression?



4 answers


Add |\\ inside the group, after \d, e.g.

([a-zA-Z]|\d|\\){2,13}





The expression can be simplified if you are willing to also allow the underscore in the second capture group and to use shorthand metacharacters. That changes this:

([a-zA-Z]|\d){2,13}

      

into this:

([\w]{2,13})

      

and you can also add a test for the backslash character (\x5c is its hex escape) like this:



([\w\x5c]{2,13})

      

which makes the regex a little easier on the eyeball, depending on your personal preference.

"\bI([\x20]{1,2})([\w\x5c]{2,13})\b"

      

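As a rough check of the simplified pattern above, here is a minimal sketch. It assumes Python's re module, which the question does not specify, and the helper name second_group is made up for illustration. It shows that \x5c inside the character class accepts a backslash and that \w also lets an underscore through.

```python
import re

# The simplified pattern from the answer, as a Python raw string.
# \x20 is a space and \x5c is a backslash, so no literal backslash
# has to be escaped inside the character class.
PATTERN = re.compile(r"\bI([\x20]{1,2})([\w\x5c]{2,13})\b")


def second_group(text):
    """Return the second capture group, or None if there is no match."""
    match = PATTERN.search(text)
    return match.group(2) if match else None


print(second_group("I ab\\cd"))  # token with a backslash in the middle
print(second_group("I a_b"))     # underscore is accepted because of \w
print(second_group("I a"))       # too short: fewer than 2 characters
```

Whether the extra underscore is acceptable depends on your data; if it is not, an explicit class such as [a-zA-Z\d\x5c] avoids it.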


Both @slavy13 and @dreftymac point you toward a basic solution, but ...

  • You can use \d inside a character class to denote a digit.
  • You don't need to put a space in a character class to match it (except perhaps for clarity, although this is debatable).
  • You can use [:alpha:] inside a character class to mean an alphabetic character, [:digit:] to mean a digit, and [:alnum:] to mean alphanumeric (specifically, not including underscore, unlike \w). Note that these character classes can match more characters than you expect; think about accented characters and non-Arabic numerals, especially in Unicode.
  • If you want to capture all the information after the space, you need to put the repetition inside the capturing parentheses.
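The Unicode caveat in the third bullet can be illustrated with a small sketch. It assumes Python 3, whose re module does not support the POSIX [:alpha:]-style classes, so \d and \w stand in for them here; the helper name classify is hypothetical.

```python
import re


def classify(ch):
    """Report how a single character is treated by \\d and \\w."""
    return {
        # Unicode-aware by default for str patterns in Python 3
        "digit": bool(re.fullmatch(r"\d", ch)),
        "word": bool(re.fullmatch(r"\w", ch)),
        # re.ASCII restricts \w to [a-zA-Z0-9_]
        "ascii_word": bool(re.fullmatch(r"\w", ch, flags=re.ASCII)),
    }


samples = {
    "ASCII digit": "7",
    "Arabic-Indic digit three": "\u0663",
    "e with acute accent": "\u00e9",
    "underscore": "_",
}

for label, ch in samples.items():
    print(label, classify(ch))
```

Other engines draw these lines differently, so it is worth running a similar check in whatever engine you actually use.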

Contrast the behavior of these two one-liners:

perl -n -e 'print "$2\n" if m/\bI( {1,2})([a-zA-Z\d\\]){2,13}\b/'

perl -n -e 'print "$2\n" if m/\bI( {1,2})([a-zA-Z\d\\]{2,13})\b/'

      

Given the input string "I a123", the first prints "3" and the second prints "a123". Obviously, if all you wanted was the last character of the second part of the string, then the original expression would be fine. However, that is unlikely to be the requirement. (Of course, if you are only interested in the whole match, then using "$&" gives you the matched text, but it has a negative effect on efficiency.)

I would probably use this regex, as it seems clearer to me:

m/\bI( {1,2})([[:alnum:]\\]{2,13})\b/
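For readers who are not using Perl, the contrast between the two capture forms in the one-liners can be reproduced with a sketch like the following. It assumes Python's re module and uses the explicit class [a-zA-Z\d\\] from the one-liners, because Python's re does not support [[:alnum:]]; the names OUTSIDE, INSIDE and group2 are made up for illustration.

```python
import re

# Quantifier outside the parentheses: the group is overwritten on every
# repetition, so it ends up holding only the last character matched.
OUTSIDE = re.compile(r"\bI( {1,2})([a-zA-Z\d\\]){2,13}\b")

# Quantifier inside the parentheses: the group holds the whole run.
INSIDE = re.compile(r"\bI( {1,2})([a-zA-Z\d\\]{2,13})\b")


def group2(pattern, text):
    """Return capture group 2 for the first match, or None."""
    match = pattern.search(text)
    return match.group(2) if match else None


print(group2(OUTSIDE, "I a123"))  # 3
print(group2(INSIDE, "I a123"))   # a123
```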

      

Time for the obligatory recommendation: read Jeff Friedl's Mastering Regular Expressions.



As I indicated in my comment on slavy's post, a trailing \\ followed by \b will not match as intended, because the backslash is not a word character. Hence my suggestion:

/\bI([ ]{1,2})([\p{IsAlnum}\\]{2,13})(?:[^\w\\]|$)/ 

      

I assumed you wanted to capture all of the 2-13 characters, not just a single one of them, so I adjusted the RE.

You can make the last group a lookahead if the engine supports it and you don't want to consume that character. It would look like this:

/\bI([ ]{1,2})([\p{IsAlnum}\\]{2,13})(?=[^\w\\]|$)/ 
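The effect of replacing the trailing \b can be seen in a minimal sketch. It assumes Python's re module, which has no \p{IsAlnum}, so [a-zA-Z\d\\] is used as a stand-in for the character class; WITH_BOUNDARY, WITH_LOOKAHEAD and group2 are hypothetical names.

```python
import re

# Trailing \b: a backslash at the end of the token is not a word
# character, so the engine backtracks and drops it from the capture.
WITH_BOUNDARY = re.compile(r"\bI( {1,2})([a-zA-Z\d\\]{2,13})\b")

# Lookahead instead of \b: the token must be followed by something that
# is neither a word character nor a backslash, or by the end of input.
WITH_LOOKAHEAD = re.compile(r"\bI( {1,2})([a-zA-Z\d\\]{2,13})(?=[^\w\\]|$)")


def group2(pattern, text):
    """Return capture group 2 for the first match, or None."""
    match = pattern.search(text)
    return match.group(2) if match else None


text = "I ab\\"  # the characters: I, space, a, b, backslash

print(group2(WITH_BOUNDARY, text))   # ab   (backslash silently lost)
print(group2(WITH_LOOKAHEAD, text))  # ab\  (backslash kept)
```

Because the lookahead does not consume the following character, a later match can still start right after the token.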

      
