How do I match a literal string that contains regex syntax?

Suppose I have a string like

x = "spam ?and eggs"

      

And I am trying to match the substring "?and". I am currently doing it like this:

>>> print(re.findall(re.escape('?and'), x))
['?and']

      

Is this the correct use case for re.escape? Will it work with any string literal I'm looking for that might contain other kinds of regex syntax?

My use case is the argument to pexpect.spawn.expect(pattern), where the input pattern can be a string that will be compiled into a regular expression. In some cases, what I'm looking for may look like a regex, but it is actually a string literal that I want to match.



3 answers


For pexpect, you can use the expect_exact() method instead of expect() to disable the regex functionality; it will match exactly the Python string you give it.

From the docs:



expect_exact(self, pattern_list, timeout=-1, searchwindowsize=-1)
This is similar to expect(), but uses plain string matching instead of compiled regular expressions in 'pattern_list'. The 'pattern_list' may be a string; a list or other sequence of strings; or TIMEOUT and EOF.

This call might be faster than expect() for two reasons: string searching is faster than RE matching, and it is possible to limit the search to just the end of the input buffer.

This method is also useful when you don't want to have to worry about escaping regular expression characters that you want to match.
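As a rough illustration of the point above, here is a minimal sketch of waiting for a literal string with expect_exact(). The helper name wait_for_literal, the demo function, and the echo command are assumptions made up for this example, not part of the question; pexpect is a third-party package and is imported lazily inside demo.

```python
def wait_for_literal(child, text, timeout=5):
    """Block until `text` appears verbatim in the child's output.

    `child` is assumed to be a pexpect.spawn-like object. expect_exact()
    does plain string matching, so `text` needs no regex escaping.
    """
    child.expect_exact(text, timeout=timeout)
    # After a match, pexpect stores the output before the match in
    # `before` and the matched text itself in `after`.
    return child.before, child.after


def demo():
    """Hypothetical usage; needs pexpect and a Unix-like system."""
    import pexpect  # third-party, imported here so the helper has no hard dependency

    child = pexpect.spawn("echo", ["spam ?and eggs"], encoding="utf-8")
    try:
        return wait_for_literal(child, "?and")
    finally:
        child.close()
```

Passing "?and" to expect() instead would hand it to the regex compiler, which is the situation the question is trying to avoid.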



Yes, that's exactly the right use case for re.escape. The documentation says it is "useful if you want to match an arbitrary literal string that may have regular expression metacharacters in it". In your first example, though, I think it's a little easier to escape the question mark yourself, using any of these:



re.findall(r'\?and', x)          # \? in a raw string literal
re.findall('\\?and', x)          # \? in a non-raw string literal, so, \\?
re.findall('[?]and', x)          # "cheat" by using a character class
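To compare the options side by side, the following sketch checks that the hand-escaped patterns and the re.escape() result find the same text, and that the unescaped string is not even a valid pattern. The variable names and the "tricky" literal are made up for illustration.

```python
import re

x = "spam ?and eggs"
needle = "?and"  # the literal text we want to find

# Used as-is, the leading "?" is a quantifier with nothing to repeat,
# so the pattern fails to compile.
try:
    re.findall(needle, x)
    compiled_unescaped = True
except re.error:
    compiled_unescaped = False

# re.escape() backslash-escapes the metacharacters for us.
escaped = re.escape(needle)

# The hand-escaped spellings and the re.escape() result find the same text.
patterns = [r"\?and", "\\?and", "[?]and", escaped]
results = [re.findall(p, x) for p in patterns]

# A hypothetical literal containing several metacharacters at once.
tricky = "a.b*(c)|d"
tricky_match = re.search(re.escape(tricky), "xx a.b*(c)|d yy")

print(compiled_unescaped)
print(results)
print(tricky_match.group(0))
```

Escaping by hand is fine for a fixed pattern you can read; re.escape() is the safer choice when the literal comes from a variable or user input.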

      



Yes, that looks right to me. If you are escaping your entire pattern, that is usually a good sign that you should use find instead of a regexp.

x.find('?and')

      

It returns -1 if the substring is not found, or the position of the first occurrence otherwise. So:

>>> if x.find('?and') != -1: 
...   print("Match!")
... 
Match!
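For completeness, a small sketch of the plain-string approach, assuming you only need to know whether (and where) the literal occurs. The variable names are illustrative; the in operator is shown as a common alternative when the position is not needed.

```python
x = "spam ?and eggs"
needle = "?and"

position = x.find(needle)   # index of the first occurrence, or -1 if absent
found = needle in x         # True/False membership test, no index needed
missing = x.find("ham")     # a substring that does not occur in x

if found:
    print("Match at index", position)
```

Neither call involves the regex engine, so no character in the search string has special meaning.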

      
