Perl regex does not match line with newline character \n

Question

I am trying to use Perl (v5.14.2) from the Bash shell (GNU Bash 4.2) on Kubuntu (GNU/Linux) to find and replace text that spans a newline character, but I have not succeeded yet.

Here's the text file I'm searching:

<!-- filename: prac1.html -->

hello
kitty

blah blah blah

When I use my text editor's (Kate's) find-and-replace feature or a regex tester (http://regexpal.com/), I can easily get this regex to work:

hello\nkitty

But when using perl on the command line, none of the following commands worked:

perl -p -i -e 's,hello\nkitty,newtext,' prac1.html
perl -p -i -e 's,hello.kitty,newtext,s' prac1.html
perl -p -i -e 's,hello.*kitty,newtext,s' prac1.html
perl -p -i -e 's,hello[\S\s]kitty,newtext,' prac1.html
perl -p -i -e 's,hello[\S\s]*kitty,newtext,' prac1.html

In fact, I got desperate and tried many other patterns, including all of these (different permutations of the "single line" and "multi-line" modifiers):

perl -p -i -e 's,hello\nkitty,newtext,' prac1.html
perl -p -i -e 's,hello.kitty,newtext,' prac1.html
perl -p -i -e 's,hello\nkitty,newtext,s' prac1.html
perl -p -i -e 's,hello.kitty,newtext,s' prac1.html
perl -p -i -e 's,hello\nkitty,newtext,m' prac1.html
perl -p -i -e 's,hello.kitty,newtext,m' prac1.html
perl -p -i -e 's,hello\nkitty,newtext,ms' prac1.html
perl -p -i -e 's,hello.kitty,newtext,ms' prac1.html

perl -p -i -e 's,hello[\S\s]kitty,newtext,' prac1.html
perl -p -i -e 's,hello[\S\s]*kitty,newtext,' prac1.html
perl -p -i -e 's,hello$[\S\s]^kitty,newtext,' prac1.html
perl -p -i -e 's,hello$[\S\s]*^kitty,newtext,' prac1.html
perl -p -i -e 's,hello[\S\s]kitty,newtext,s' prac1.html
perl -p -i -e 's,hello[\S\s]*kitty,newtext,s' prac1.html
perl -p -i -e 's,hello$[\S\s]^kitty,newtext,s' prac1.html
perl -p -i -e 's,hello$[\S\s]*^kitty,newtext,s' prac1.html
perl -p -i -e 's,hello[\S\s]kitty,newtext,m' prac1.html
perl -p -i -e 's,hello[\S\s]*kitty,newtext,m' prac1.html
perl -p -i -e 's,hello$[\S\s]^kitty,newtext,m' prac1.html
perl -p -i -e 's,hello$[\S\s]*^kitty,newtext,m' prac1.html
perl -p -i -e 's,hello[\S\s]kitty,newtext,ms' prac1.html
perl -p -i -e 's,hello[\S\s]*kitty,newtext,ms' prac1.html
perl -p -i -e 's,hello$[\S\s]^kitty,newtext,ms' prac1.html
perl -p -i -e 's,hello$[\S\s]*^kitty,newtext,ms' prac1.html

(I've also tried using \r, \r\n, \R, \f, \D, etc., and the global modifier.)

Can anyone identify the problem or suggest a solution?

regex perl

zeroparallax 16 Feb At 1:09 am

2 answers

The problem is that "-p" implicitly wraps the loop below around your "-e" code, and "<>" reads the input one line at a time, so your regex never gets a chance to see more than one line.

 LINE:
       while (<>) {
           ...             # your program goes here
       } continue {
           print or die "-p destination: $!\n";
       }

See the perlrun man page for more information.

Bluby 16 Feb At 1:14

Gilles quenot · Accepted Answer · 2013-02-16T02:21:51+0000

Try this, which works by changing the input record separator (a newline by default):

perl -i -p00e 's,hello\nkitty,newtext,' prac1.html

From perldoc perlrun:

-0[octal/hexadecimal]

specifies the input record separator ($/) as an octal or hexadecimal number. If there are no digits, the null character is the separator. Other switches may precede or follow the digits. For example, if you have a version of find which can print filenames terminated by the null character, you can say this:
find . -name '*.orig' -print0 | perl -n0e unlink

The special value 00 will cause Perl to slurp files in paragraph mode. Any value 0400 or above will cause Perl to slurp files whole, but by convention the value 0777 is the one normally used for this purpose.
