Matching strings with re.match doesn't work

From this link I used the following code:

import re
my_other_string = 'the_boat_has_sunk'
my_list = ['car', 'boat', 'truck']
my_list = re.compile(r'\b(?:%s)\b' % '|'.join(my_list))
if re.match(my_list, my_other_string):
    print('yay')

      

However, this doesn't work. I tried to print my_list after re.compile and it prints this:

re.compile('\\b(?:car|boat|truck)\\b')

      

What am I doing wrong?


2 answers


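This is not a regular sentence: the words are joined with underscores. Since you are only checking whether a word is present, you can either remove \b (it matches only at a word boundary, and _ is a word character, so there is no boundary between _ and boat) or add _ as an alternative delimiter. Note that re.search is used instead of re.match, because re.match only tries the start of the string: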

import re
my_other_string = 'the_boat_has_sunk'
my_list = ['car', 'boat', 'truck']
my_list = re.compile(r'(?:\b|_)(?:%s)(?=\b|_)' % '|'.join(my_list))
if re.search(my_list, my_other_string):
    print('yay')

      

See IDEONE demo
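
To see why the original pattern finds nothing in this string, it may help to compare the variants side by side. This is only a minimal sketch; the names words, alternation, strict, loose and delimited are illustrative and do not come from the question.

```python
import re

words = ['car', 'boat', 'truck']
alternation = '|'.join(words)
text = 'the_boat_has_sunk'

# \b needs a word character next to a non-word character (or a string edge).
# '_' is a word character, so there is no boundary between '_' and 'b'.
strict = re.compile(r'\b(?:%s)\b' % alternation)
# No boundaries at all: matches the word anywhere, even inside longer words.
loose = re.compile(r'(?:%s)' % alternation)
# Treat '_' as an additional delimiter, as in the answer above.
delimited = re.compile(r'(?:\b|_)(?:%s)(?=\b|_)' % alternation)

print(bool(strict.search(text)))                             # False
print(bool(loose.search(text)))                              # True
print(bool(delimited.search(text)))                          # True
print(bool(loose.search('the_boathouse_has_sunk')))          # True
print(bool(delimited.search('the_boathouse_has_sunk')))      # False
```

Dropping \b entirely is the simplest fix, but it also accepts boathouse; the delimited variant does not.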

EDIT



Since you say the check should be true if one of the words in the list occurs in the string, not only as a separate word, but it must not match if, for example, boathouse is in the string, I suggest first replacing non-word characters and _ with a space, and then using the regex you had with \b:

import re
my_other_string = 'the_boathouse_has_sunk'
my_list = ['car', 'boat', 'truck']
my_other_string = re.sub(r'[\W_]', ' ', my_other_string)
my_list = re.compile(r'\b(?:%s)\b' % '|'.join(my_list))
if re.search(my_list, my_other_string):
    print('yay')

      

This won't print yay, but if you delete house it will.

See IDEONE Demo 2
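
The normalize-then-match idea could be wrapped in a small helper, roughly as sketched below. The function name contains_word is hypothetical, and the use of re.escape is an addition not present in the answer; it is there on the assumption that the word list might contain regex metacharacters.

```python
import re

def contains_word(text, words):
    """Return True if any of `words` occurs in `text` as a whole word,
    treating underscores and other non-word characters as separators."""
    # Replace every non-word character and every underscore with a space.
    normalized = re.sub(r'[\W_]', ' ', text)
    # re.escape (an assumption, not in the original answer) guards against
    # words that contain regex metacharacters such as '.' or '+'.
    pattern = re.compile(r'\b(?:%s)\b' % '|'.join(map(re.escape, words)))
    return pattern.search(normalized) is not None

words = ['car', 'boat', 'truck']
print(contains_word('the_boat_has_sunk', words))       # True
print(contains_word('the_boathouse_has_sunk', words))  # False
print(contains_word('I am on a boat', words))          # True
```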


re.match only tries to match the regular expression at the beginning of the input string. So it will only work for strings that start with one of the words from my_list.

re.search, on the other hand, scans the entire string for a position where the regular expression matches.

import re

my_list = ['car', 'boat', 'truck']
my_other_string = 'I am on a boat'

my_list = re.compile(r'\b(?:%s)\b' % '|'.join(my_list))
if re.search(my_list, my_other_string):  # changed from re.match to re.search
    print('yay')

      



For the string 'I am on a boat', re.match fails because the string begins with "I", which doesn't match the regex. re.search also fails at the first character, but it keeps scanning through the string until it reaches "boat", where it finds a match.

If you instead use the string 'boat is what I find', both re.match and re.search succeed, because the string now starts with a match. (The pattern is case-sensitive, so a capitalized "Boat" would need the re.IGNORECASE flag.)
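
The difference between the two functions can be checked directly. The following is a small sketch; the names pattern, results and insensitive are illustrative only. Keep in mind that the pattern is case-sensitive, so a capitalized "Boat" only matches when re.IGNORECASE is passed.

```python
import re

pattern = re.compile(r'\b(?:car|boat|truck)\b')

results = {}
for text in ('I am on a boat', 'boat is what I find'):
    anchored = pattern.match(text)   # tries the pattern only at position 0
    scanned = pattern.search(text)   # tries each position until one matches
    results[text] = (bool(anchored), bool(scanned))
    print(repr(text), results[text])

# Case matters: 'Boat' is not matched unless the IGNORECASE flag is given.
insensitive = re.compile(r'\b(?:car|boat|truck)\b', re.IGNORECASE)
print(bool(pattern.match('Boat is what I find')))      # False
print(bool(insensitive.match('Boat is what I find')))  # True
```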
