Regular expression matching problem in newer version of Perl

I have migrated to a new server with Perl 5.22.1. I have this bit of code:

$html =~ m{
    ( # $1 the whole tag
        <
        (
            ?:
            !--
            ( # $2 the attributes are all the data between
                .*?
            )
            --
            | # or
            (
                ?:
                ( # $3 the name of the tag
                    /?\S+?\b
                )
                ( # $4 the attributes
                    [^'">]*
                    (
                        ?:
                        ( # $5 just to match quotes
                            ['"]
                        )
                        .*?\5
                        [^'">]*
                    )*
                )
            )
        )
        >
    )
}gsx

      

... and now it gives me this error:

A fatal error has occurred:

    In '(?...)', the '(' and '?' must be adjacent in regex; marked by <-- HERE in m/
                ( # $1 the whole tag
                    <
                    (
                        ? <-- HERE :
                        !--
                        ( # $2 the attributes are all the data between
                            .*?
                        )
                        --
                        | # or
                        (
                            ?:
                            ( # $3 the name of the tag
                                /?\S+?\b
                            )
                            ( # $4 the attributes
                                [^'">]*
                                (
                                    ?:
                                    ( # $5 just to match quotes
                                        ['"]
                                    )
                                    .*?\5
                                    [^'">]*
                                )*
                            )
                        )
                    )
                    >
                )
            / at ./admin/GT/HTML/Parser.pm line 207.
    Compilation failed in require at (eval 25) line 8.

Please enable debugging in setup for more details.

      

I'm not really sure what it is complaining about. Any ideas?



1 answer


You need to make sure that the ?: (the non-capturing group marker) comes immediately after the opening parenthesis, even when the x modifier is used.

See the revised regex declaration:

$html =~ m{
    ( # $1 the whole tag
        <
        (?:
            !--
            ( # $2 the attributes are all the data between
                .*?
            )
            --
            | # or
            (?:
                ( # $3 the name of the tag
                    /?\S+?\b
                )
                ( # $4 the attributes
                    [^'">]*
                    (?:
                        ( # $5 just to match quotes
                            ['"]
                        )
                        .*?\5
                        [^'">]*
                    )*
                )
            )
        )
        >
    )
}gsx

      



See this passage from the perlre documentation:

Note that anything inside a \Q...\E stays unaffected by /x. And note that /x doesn't affect space interpretation within a single multi-character construct. For example, in \x{...}, regardless of the /x modifier, there can be no spaces. The same goes for a quantifier such as {3} or {5,}. Similarly, (?:...) can't have a space between the "{", "?", and ":". Within any delimiters for such a construct, allowed spaces are not affected by /x, and depend on the construct. For example, \x{...} can't have spaces because hexadecimal numbers don't have spaces in them.

I think there is a typo there: the "{" should actually be "(". The sentence about (?:...) is the part that is relevant to the current scenario.
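If you want to see the rule in isolation, here is a minimal sketch. The helper name compiles_under_x and the two toy patterns are made up for illustration and are not part of GT::HTML::Parser. It builds each pattern at run time so that a compile failure can be caught with eval instead of aborting the script. On Perl 5.22 or later I would expect the first pattern to be rejected and the second to be accepted; older versions may only warn about the first.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $pattern compiles under /x, 0 otherwise.
sub compiles_under_x {
    my ($pattern) = @_;
    no warnings;    # keep older Perls quiet about the deprecated form
    # The pattern is interpolated, so it is compiled at run time and a
    # compile error becomes an exception that eval can trap.
    return eval { qr/$pattern/x; 1 } ? 1 : 0;
}

# Same mistake as in the question: whitespace between "(" and "?:".
my $split  = '( ?: a | b )';
# Fixed form: "(?:" written as one token; other whitespace is still fine.
my $joined = '(?: a | b )';

for my $pattern ($split, $joined) {
    printf "%-14s %s\n", $pattern,
        compiles_under_x($pattern) ? 'compiles' : 'does not compile';
}
```

The whitespace after (?: and around the alternatives is still ignored by /x; only the (?: token itself has to stay in one piece.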
