Regex to match an n-quoted string

It's pretty easy to match a string with escape sequences:

"(\\.|[^"])*"
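As a quick check of the pattern above, here is a minimal sketch using Python's re module. The choice of Python, the name ESCAPED, and the sample text are assumptions for illustration, not part of the question.

```python
import re

# The escape-aware pattern from the question: a backslash consumes the
# character after it, so an escaped quote cannot close the string.
ESCAPED = re.compile(r'"(\\.|[^"])*"')

sample = r'say "hello\"world" now'
match = ESCAPED.search(sample)
print(match.group(0))  # "hello\"world"
```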

      

But what if I wanted to support not only Python's triple-quoted strings ("""hello"""), but strings delimited by any number of quotes, as long as the opening run of quotes matches the closing run?

("+)(.*?)\1

      

This works, but I lose support for escape sequences: a string such as "hello\"world" must still be matched in its entirety.
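The trade-off can be reproduced with a short sketch, again assuming Python's re module; the name N_QUOTED is hypothetical. The backreference handles the triple-quoted case, but the lazy body stops at the first quote it sees, escaped or not.

```python
import re

# The backreference pattern from the question.
N_QUOTED = re.compile(r'("+)(.*?)\1')

# Triple quotes: \1 forces the closing run to equal the opening run.
triple = N_QUOTED.search('"""hello"""')
print(triple.group(2))  # hello

# Escaped quote: the lazy body ends at the first quote, so the match
# is cut short right after the backslash.
broken = N_QUOTED.search(r'"hello\"world"')
print(broken.group(0))  # "hello\"
```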

Other examples that should be matched in full:

  • """hello world""""" (the string ends with two quotes)
  • ""hello\""world"" (there are two quotes in the middle, but one is escaped and the other alone is not enough to end the string)

Is this possible with regular expressions?



2 answers


How about this:

^("+)((.*?)(?<!\\)(?:\\\\)*)\1$

      



I had a simpler expression before, but it failed on "hello world \\", so I added a negative lookbehind followed by a group of backslash pairs: an even number of backslashes may precede the closing quotes, but an odd number may not. The regex syntax used is PCRE.

Fiddle here .
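For readers without access to the fiddle, the following is a minimal sketch that exercises the same pattern with Python's re module instead of PCRE. The helper name body and the test strings beyond those in the question are assumptions. Python accepts the pattern unchanged because the lookbehind is fixed-width.

```python
import re

# The anchored pattern from this answer.
N_QUOTED = re.compile(r'^("+)((.*?)(?<!\\)(?:\\\\)*)\1$')

def body(text):
    """Return the content between the quote runs, or None if there is no match."""
    m = N_QUOTED.match(text)
    return m.group(2) if m else None

print(body('"""hello world"""""'))  # hello world""
print(body(r'""hello\""world""'))   # hello\""world
print(body(r'"hello world \\"'))    # hello world \\
print(body(r'"hello world \"'))     # None (odd number of backslashes)
```

Because the pattern is anchored with ^ and $, it validates a whole string rather than finding quoted strings inside longer text, and `.` will not cross newlines unless re.DOTALL is passed.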



This expression matches all of your examples:

("+)(.*[\\"]?.*?)+?\1

      



see http://regex101.com/r/jW8iV8/1
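A small sketch, assuming Python's re module and the hypothetical name SECOND, can be used to check this expression against the examples from the question. It also suggests a caveat: because the body starts with a greedy `.*`, the match appears to run to the last quote on the line, so two separate strings on one line are swallowed together.

```python
import re

# The expression from this answer.
SECOND = re.compile(r'("+)(.*[\\"]?.*?)+?\1')

examples = ['"""hello world"""""', r'""hello\""world""', r'"hello\"world"']
for text in examples:
    # Each example is matched from its first quote to its last.
    print(SECOND.search(text).group(0) == text)  # True

# Two adjacent strings: the greedy .* reaches the final quote.
print(SECOND.search('"a" + "b"').group(0))  # "a" + "b"
```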
