Nested regex
I have a string with alphanumeric values. The numeric values vary. The alphabetic values are always 'abc' and 'ghi', but I don't know their order. A numeric value always follows each alphabetic value.
Valid examples of this type of string are:
a = 'abc10ghi1450'
b = 'abc11ghi9285'
c = 'ghi1abc9'
...
Now I want to store the numbers after 'abc' and 'ghi' in the corresponding variables, and what I do is:
>>> import re
>>> string = 'abc10ghi44'
>>> abc = re.search(r'abc\d+', string).group(0)
>>> abc = re.search(r'\d+', abc).group(0)
>>> ghi = re.search(r'ghi\d+', string).group(0)
>>> ghi = re.search(r'\d+', ghi).group(0)
>>> print abc, ghi
10 44
I use two regexes for each variable, and I don't like it. Is there a smarter way to do the same thing?
Yes, make a capturing group around the numbers and use this:
>>> import re
>>> string = 'abc10ghi44'
>>> re.search(r'abc(\d+)', string).group(1)
'10'
Notice the parentheses around \d+ and the 1 in the call to group.
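Since the order of 'abc' and 'ghi' is not known in advance, the same capturing-group idea can be applied to each label separately. The following is only a minimal sketch: the helper name number_after is hypothetical (it is not part of the re module), and returning None when a label is missing is an assumption about what you would want in that case.

```python
import re

def number_after(label, text):
    """Return the digits that directly follow `label` in `text`, or None.

    `number_after` is a hypothetical helper written for this example.
    """
    # re.escape keeps the label literal; the group captures only the digits.
    match = re.search(re.escape(label) + r'(\d+)', text)
    return match.group(1) if match else None

string = 'ghi1abc9'  # labels in the "other" order
abc = number_after('abc', string)
ghi = number_after('ghi', string)
```

Here abc ends up as '9' and ghi as '1', both as strings. Wrap the result in int() if you need numbers.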
Alternatively, use a positive lookbehind:
>>> re.search(r'(?<=abc)\d+', string).group(0)
'10'
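Because the question needs both numbers, another option (a different technique from the lookbehind, shown only as a sketch) is a single pattern with two capturing groups, one for the label and one for the digits. re.findall then returns (label, digits) pairs in whatever order they occur, and dict turns them into a lookup. The variable names pairs and values are just illustrative.

```python
import re

string = 'abc10ghi44'

# With two groups in the pattern, findall returns a list of
# (label, digits) tuples, one per match, in order of appearance.
pairs = re.findall(r'(abc|ghi)(\d+)', string)

# Map each label to its digits so the order in the string no longer matters.
values = dict(pairs)

abc = values['abc']
ghi = values['ghi']
```

This scans the string once and works the same for 'ghi1abc9'. A missing label would raise KeyError on lookup, so use values.get('abc') if that can happen.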