Perl regex does not match line with newline character \n
I am trying to use Perl (v5.14.2) from the Bash shell (GNU Bash 4.2) on Kubuntu (GNU/Linux) to find and replace text that spans a newline character, but I have not succeeded yet.
Here is the text file I'm searching:
<!-- filename: prac1.html -->
hello
kitty
blah blah blah
When I use my text editor's (Kate's) find-and-replace feature, or a regex tester (http://regexpal.com/), this regex works without trouble:
hello\nkitty
But when using perl on the command line, none of the following commands worked:
perl -p -i -e 's,hello\nkitty,newtext,' prac1.html
perl -p -i -e 's,hello.kitty,newtext,s' prac1.html
perl -p -i -e 's,hello.*kitty,newtext,s' prac1.html
perl -p -i -e 's,hello[\S\s]kitty,newtext,' prac1.html
perl -p -i -e 's,hello[\S\s]*kitty,newtext,' prac1.html
In fact, I got desperate and tried many other patterns, including all of these (various permutations of "single line" and "multi-line" modes):
perl -p -i -e 's,hello\nkitty,newtext,' prac1.html
perl -p -i -e 's,hello.kitty,newtext,' prac1.html
perl -p -i -e 's,hello\nkitty,newtext,s' prac1.html
perl -p -i -e 's,hello.kitty,newtext,s' prac1.html
perl -p -i -e 's,hello\nkitty,newtext,m' prac1.html
perl -p -i -e 's,hello.kitty,newtext,m' prac1.html
perl -p -i -e 's,hello\nkitty,newtext,ms' prac1.html
perl -p -i -e 's,hello.kitty,newtext,ms' prac1.html
perl -p -i -e 's,hello[\S\s]kitty,newtext,' prac1.html
perl -p -i -e 's,hello[\S\s]*kitty,newtext,' prac1.html
perl -p -i -e 's,hello$[\S\s]^kitty,newtext,' prac1.html
perl -p -i -e 's,hello$[\S\s]*^kitty,newtext,' prac1.html
perl -p -i -e 's,hello[\S\s]kitty,newtext,s' prac1.html
perl -p -i -e 's,hello[\S\s]*kitty,newtext,s' prac1.html
perl -p -i -e 's,hello$[\S\s]^kitty,newtext,s' prac1.html
perl -p -i -e 's,hello$[\S\s]*^kitty,newtext,s' prac1.html
perl -p -i -e 's,hello[\S\s]kitty,newtext,m' prac1.html
perl -p -i -e 's,hello[\S\s]*kitty,newtext,m' prac1.html
perl -p -i -e 's,hello$[\S\s]^kitty,newtext,m' prac1.html
perl -p -i -e 's,hello$[\S\s]*^kitty,newtext,m' prac1.html
perl -p -i -e 's,hello[\S\s]kitty,newtext,ms' prac1.html
perl -p -i -e 's,hello[\S\s]*kitty,newtext,ms' prac1.html
perl -p -i -e 's,hello$[\S\s]^kitty,newtext,ms' prac1.html
perl -p -i -e 's,hello$[\S\s]*^kitty,newtext,ms' prac1.html
(I've also tried using \r, \r\n, \R, \f, \D etc., and global mode.)
Can anyone identify the problem or suggest a solution?
Try this. It works because it changes the input record separator (a newline by default):
perl -i -p00e 's,hello\nkitty,newtext,' prac1.html
From perldoc perlrun:
-0[octal/hexadecimal]
specifies the input record separator ($/) as an octal or hexadecimal number. If there are no digits, the null character is the separator. Other switches may precede or follow the digits. For example, if you have a version of find which can print filenames terminated by the null character, you can say this:
find . -name '*.orig' -print0 | perl -n0e unlink
The special value 00 will cause Perl to slurp files in paragraph mode. Any value of 0400 or above will cause Perl to slurp files whole, but by convention the value 0777 is the one normally used for this purpose.
The problem is that "-p" has already implicitly wrapped this loop around your "-e", and "<>" splits the input into lines, so your regex will never get a chance to see more than one line.
LINE:
  while (<>) {
      ...             # your program goes here
  } continue {
      print or die "-p destination: $!\n";
  }
See the perlrun man page for more information.