Where do backreference regexes belong to the Hom hierarchy (do they really belong)?

I'm a little confused here - RegExes with backreferences appear to be not regular expressions because they can, for example, be used to describe a copying language ("ww" for any word w) that is context sensitive, But then however, they still cannot be used to describe context-free languages ​​like HTML (or even suitable parentheses) - at least I don't know what such a thing would look like, for example. POSIX Regular Expressions.

That being said, do these kind of "regular expressions" belong somewhere in the Chomsky hierarchy, or are they some frankenstein abomination between the lines?

+3


source to share


1 answer


They really don't fit.

Backreference modes may match some non-context-free languages ​​(for example (.*)\1

), but also may not match all context -free languages ​​( nested parentheses are typical example).



Here's a related post on the CSTheory StackExchange that has a few details.

Note also that some implementations (such as .NET or Perl) go further than backreferences and may match nested parentheses.

+2


source







All Articles