How do I fix this regex in Python?

I want to process some string data, which is printed like this:

'node0, node1 0.04, node8 11.11, node14 72.21\n'
'node1, node46 1247.25, node6 20.59, node13 64.94\n'

      

I want to find all the floats here. This is the code I am using:

for node in nodes:
    pattern= re.compile('(?<!node)\d+.\d+')
    distance = pattern.findall(node)

      

However, the result is this:

['0.04', '11.11', '4 72']

      

while I want:

['0.04', '11.11', '72.21']

      

Any suggestion on fixing this regex?



2 answers


In regular expressions, the . character is interpreted as a wildcard and can match (almost) any character. So your search pattern actually allows one or more digits, followed by any character, followed by one or more digits. To stop this interpretation of the dot, escape it with a backslash: \.
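To illustrate the difference, here is a minimal sketch comparing the two patterns on the last entry of the first sample line. The variable names (sample, loose, strict) are only for illustration and do not come from the question.

```python
import re

sample = "node14 72.21"

# Unescaped dot: matches any character except a newline, including the space.
loose = re.findall(r"(?<!node)\d+.\d+", sample)

# Escaped dot: matches only a literal period.
strict = re.findall(r"(?<!node)\d+\.\d+", sample)

print(loose)   # ['4 72']
print(strict)  # ['72.21']
```

The lookbehind only rejects a digit that comes directly after "node", so the 4 in node14 is still a valid starting point. With the unescaped dot, the space then counts as "any character" and the match runs on into 72.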

(As an aside: you don't need to compile the regex pattern inside your loop. This will actually slow down your code.)

import re

# Compile once, outside the loop; the raw string keeps the backslashes intact.
pattern = re.compile(r'(?<!node)\d+\.\d+')
for node in nodes:
    distance = pattern.findall(node)
    print(distance)

      



output:

['0.04', '11.11', '72.21']
['1247.25', '20.59', '64.94']
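For a self-contained check, the same loop can be run against the two sample lines from the question. This is only a sketch: it assumes nodes is a plain list of those strings, since the question does not show how nodes is actually built, and the results list is added here just to collect the matches.

```python
import re

# Assumption: nodes is a list of the sample lines shown in the question.
nodes = [
    'node0, node1 0.04, node8 11.11, node14 72.21\n',
    'node1, node46 1247.25, node6 20.59, node13 64.94\n',
]

pattern = re.compile(r'(?<!node)\d+\.\d+')

# One list of matched strings per input line.
results = [pattern.findall(node) for node in nodes]
for distance in results:
    print(distance)
```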



The . in your expression is not escaped.



for node in nodes:
    pattern = re.compile(r"(?<!node)\d+\.\d+")
    distance = pattern.findall(node)

      
