Regex to match a non-digit or the end of the string

How do I match either the end of the string or a single non-digit character in a Python 2.6 regex search?

I want to find ten-digit numbers that are preceded by a non-digit and followed by either a non-digit or the end of the string. These are 10-digit ISBNs, so "X" is valid as the last digit.

The following attempts don't work:

is10 = re.compile(r'\D(\d{9}[\d|X|x])[$|\D]')
is10 = re.compile(r'\D(\d{9}[\d|X|x])[\$|\D]')
is10 = re.compile(r'\D(\d{9}[\d|X|x])[\Z|\D]')

      

The problem is with the last set, [$|\D], which is meant to match either a non-digit or the end of the string.

Test using:

line = "abcd0123456789"
m = is10.search(line)
print m.group(1)

line = "abcd0123456789efg"
m = is10.search(line)
print m.group(1)

      



2 answers


You need to group the alternatives with parentheses, not square brackets:

r'\D(\d{9}[\dXx])($|\D)'

      

| is a different construct from []. It denotes an alternation between two patterns, whereas [] matches one of the contained characters. Therefore, | should only be used inside [] if you want to match a literal | character. Parts of a pattern are grouped using parentheses, so parentheses should be used to limit the scope of the alternation marked by |.
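To make the difference concrete, here is a small sketch that runs the question's two test strings against both forms. The variable names are only illustrative. Inside [], the characters $ and | are literal, so the set always needs one real character to consume and cannot match at the end of the string.

```python
import re

# Inside [], '$' and '|' are literal characters, so [$|\D] means
# "one character that is '$', '|', or a non-digit". It must consume
# a character, so it cannot match the empty position at end of string.
char_set = re.compile(r'\D(\d{9}[\dXx])[$|\D]')

# With parentheses, '|' separates two alternatives:
# the end of the string, or a single non-digit.
grouped = re.compile(r'\D(\d{9}[\dXx])($|\D)')

at_end = "abcd0123456789"         # number at the end of the string
in_middle = "abcd0123456789efg"   # number followed by a non-digit

set_at_end = char_set.search(at_end)        # no match
set_in_middle = char_set.search(in_middle)  # matches, 'e' follows the number
grp_at_end = grouped.search(at_end)         # matches via '$'
grp_in_middle = grouped.search(in_middle)   # matches via '\D'

print(set_at_end)
print(set_in_middle.group(1))
print(grp_at_end.group(1))
print(grp_in_middle.group(1))
```

The first search returns None, which is why print m.group(1) fails for the first test line in the question.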



If you want to avoid the extra capturing group that this creates, you can use (?: ) instead:

r'\D(\d{9}[\dXx])(?:$|\D)'
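As a quick illustration of what the non-capturing form changes, the sketch below compares the groups returned by the two patterns. The sample line is made up for the example and uses an ISBN ending in X.

```python
import re

# Capturing alternation: the trailing ($|\D) becomes group 2.
capturing = re.compile(r'\D(\d{9}[\dXx])($|\D)')
# Non-capturing alternation: only the ISBN itself is captured.
non_capturing = re.compile(r'\D(\d{9}[\dXx])(?:$|\D)')

line = "abcd012345678Xefg"  # hypothetical sample input

cap_groups = capturing.search(line).groups()
noncap_groups = non_capturing.search(line).groups()

print(cap_groups)
print(noncap_groups)
```

Both patterns match the same text. The only difference is that the non-capturing version leaves the trailing character out of the groups, so group(1) is the only group.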

      



\D(\d{10})(?:\Z|\D)

      



This matches a non-digit followed by 10 digits and then either a non-digit or the end of the string. It captures only the digits. Although I see that your patterns look for nine digits followed by a digit, X or x, I do not see that in your requirements.
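For comparison, here is a rough sketch of how the digits-only pattern behaves next to a variant that also accepts X or x as the last character, as in the question's own patterns. The sample strings other than the question's test line are invented for illustration.

```python
import re

# The pattern from this answer: exactly ten digits.
digits_only = re.compile(r'\D(\d{10})(?:\Z|\D)')
# Variant allowing the ISBN-10 check character X or x in the last position.
with_check_char = re.compile(r'\D(\d{9}[\dXx])(?:\Z|\D)')

plain = digits_only.search("abcd0123456789")          # ten digits at end of string
x_missed = digits_only.search("isbn 012345678X")      # ends in X, so no match
x_found = with_check_char.search("isbn 012345678X")   # X accepted as last character
too_long = digits_only.search("abcd01234567890")      # eleven digits, so no match

print(plain.group(1))
print(x_missed)
print(x_found.group(1))
print(too_long)
```

The last case shows the purpose of the trailing alternation: a tenth digit that is followed by another digit is rejected, so longer digit runs are not picked up as ISBNs.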
