How do I fix this regex in python?

Question

I want to process some string data that looks like this:

'node0, node1 0.04, node8 11.11, node14 72.21\n'
'node1, node46 1247.25, node6 20.59, node13 64.94\n'

I want to find all the floats here. This is the code I am using:

for node in nodes:
    pattern = re.compile('(?<!node)\d+.\d+')
    distance = pattern.findall(node)

However, the result is

['0.04', '11.11', '4 72']

while I want

['0.04', '11.11', '72.21']

Any suggestions on fixing this regex?

+3

python regex

Kaifan Deng May 29 '15 at 19:10


2 answers

The . in your expression is not escaped, so it matches any character rather than a literal period. Escape it with a backslash:

for node in nodes:
    pattern = re.compile(r"(?<!node)\d+\.\d+")
    distance = pattern.findall(node)

+4

Navith May 29 '15 at 19:13


GreenMatt · Accepted Answer · 2015-05-29T19:19:12+0000

In regular expressions, the . character is interpreted as a wildcard and can match (almost) any character. So your search pattern actually allows a digit or a set of digits, followed by any character, followed by another digit or set of digits. To stop this interpretation of the dot character, escape it with a backslash: \.
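To make the difference concrete, here is a minimal sketch that runs both patterns against the first sample line from the question. The variable names `line`, `unescaped`, and `escaped` are made up for illustration.

```python
import re

# First sample line from the question.
line = 'node0, node1 0.04, node8 11.11, node14 72.21\n'

# Unescaped dot: "." matches any character except a newline,
# so "4 72" (digit, space, digits) is accepted as a match.
unescaped = re.findall(r'(?<!node)\d+.\d+', line)

# Escaped dot: "\." matches only a literal period.
escaped = re.findall(r'(?<!node)\d+\.\d+', line)

print(unescaped)  # ['0.04', '11.11', '4 72']
print(escaped)    # ['0.04', '11.11', '72.21']
```

The stray '4 72' appears because the lookbehind only rejects a digit that immediately follows "node". The 4 in node14 follows "1", so it is allowed to start a match, and the wildcard then consumes the space.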

(As an aside: you don't need to compile the regex pattern inside your loop. This will actually slow down your code.)

import re

# Compile once, outside the loop; the raw string keeps the backslashes intact.
pattern = re.compile(r'(?<!node)\d+\.\d+')
for node in nodes:
    distance = pattern.findall(node)
    print(distance)

output:

['0.04', '11.11', '72.21']
['1247.25', '20.59', '64.94']
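If the distances are needed as numbers rather than strings, the matches can be passed through float(). The following is only a sketch: it assumes `nodes` is a list holding the two sample lines from the question, and the name `distances` is made up for illustration.

```python
import re

# Assumption: `nodes` is a list of lines like the samples in the question.
nodes = [
    'node0, node1 0.04, node8 11.11, node14 72.21\n',
    'node1, node46 1247.25, node6 20.59, node13 64.94\n',
]

# Compile once, outside the loop.
pattern = re.compile(r'(?<!node)\d+\.\d+')

# One list of floats per input line.
distances = [[float(m) for m in pattern.findall(node)] for node in nodes]
print(distances)  # [[0.04, 11.11, 72.21], [1247.25, 20.59, 64.94]]
```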
