Delete text between two quotes in Perl?

I thought I had this figured out, but I'm stuck: I want to find all entries in a file where there is text to delete between two double quotes.

I need to find a match first, then take everything from the opening double quote up to the match, plus all the text from the match to the closing double quote, and remove it. I don't want to simply grab whatever is between two double quotes, because not every quoted string in this file is something I want to remove.

I used something like this:

perl -p -i.bak -e s/bar/foo/g bar.xml

      

to do the search and replace first, and that worked. Then I ran:

perl -p -i.bak -e s/..\/..\/bar\//g bar.xml

      

and that removed everything up to bar, but I need the removal to continue all the way to the second double quote, and I'm not sure how to do that in Perl.

I am guessing it will take some kind of regex, but I haven't tried anything. The part before bar is always the same, but the text after it varies. The part I want to remove always ends at the second double quote, and more text follows after that.

+3




3 answers


s/"[^"]*foo[^"]*"//g

      

works if there are no escaped quotes between the actual quotes, and if you want to remove the entire quoted string that contains foo.

Explanation:



"      # Match a quote
[^"]*  # Match any number of characters except quotes
foo    # Match foo
[^"]*  # Match any number of characters except quotes
"      # Match another quote

      
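As a rough illustration of the substitution above, here is a minimal sketch that applies it to a few made-up sample lines. The helper name strip_quoted_foo and the sample strings are assumptions for demonstration only; they do not come from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: delete every "..." run that contains the literal foo.
sub strip_quoted_foo {
    my ($text) = @_;
    $text =~ s/"[^"]*foo[^"]*"//g;
    return $text;
}

# Made-up sample lines for illustration.
print strip_quoted_foo('keep "alpha" drop "xx foo yy" end'), "\n";
print strip_quoted_foo('two: "foo" and "a foo b" and "bar"'), "\n";
```

One thing to watch for: the pattern has no notion of opening versus closing quotes, so a foo that sits between two separate quoted strings (as in "a" foo "b") is also matched, from the closing quote of the first string to the opening quote of the second.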

+5




Some people have asked about escaped quotes. There are a couple of tricks here. You want to ignore escaped quotes like \", but not quote characters preceded by an escaped escape, like \\". To ignore the former, I use a negative lookbehind. To avoid ignoring the latter, I temporarily change every \\ to 😺. If you have 😺 in your data, choose something else.

use v5.14;
use utf8;
use charnames qw(:full);

my $regex = qr/
    (?<!\\) "  # a quote not preceded by a \ escape
    (.*?)      # anything, non greedily
    (?<!\\) "  # a quote not preceded by a \ escape
    /x;

while( <DATA> ) {
    # encode the escaped escapes for now
    s/(?:\\){2}/\N{SMILING CAT FACE WITH OPEN MOUTH}/g;
    print "$.: ", $_;

    while( m/$regex/g ) {
        my $match = $1;
        # decode the escaped escapes
        $match =~ s/\N{SMILING CAT FACE WITH OPEN MOUTH}/\\\\/g;
        say "\tfound → $match";
        }
    }

__DATA__
"One group" and "another group"
This has "words between quotes" and words outside
This line has "an \" escaped quote" and other stuff
Start with \" then "quoted" and "quoted again"
Start with \" then "quoted \" with escape" and \" and "quoted again"
Start with \" then "quoted \\" with escape"
Start with \" then \\\\"quoted \\" with escape\\"

      



Output:

1: "One group" and "another group"
    found → One group
    found → another group
2: This has "words between quotes" and words outside
    found → words between quotes
3: This line has "an \" escaped quote" and other stuff
    found → an \" escaped quote
4: Start with \" then "quoted" and "quoted again"
    found → quoted
    found → quoted again
5: Start with \" then "quoted \" with escape" and \" and "quoted again"
    found → quoted \" with escape
    found → quoted again
6: Start with \" then "quoted 😺" with escape"
    found → quoted \\
7: Start with \" then 😺😺"quoted 😺" with escape😺"
    found → quoted \\

      
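The program above only finds the quoted strings. To go one step further and delete the ones that contain a given piece of text, something like the following sketch might work. Note that it swaps in a different technique from the one above: instead of a lookbehind plus a placeholder character, it uses a single alternation that consumes either an escaped character or a complete quoted string. The helper name remove_quoted_containing and the sample input are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: remove each double-quoted string whose body contains
# the given needle. A backslash followed by any character is treated as one
# escaped unit, both inside and outside quoted strings.
sub remove_quoted_containing {
    my ($text, $needle) = @_;
    $text =~ s{
        ( \\. )                       # group 1: an escaped character outside a string
      |
        " ( (?: [^"\\] | \\. )* ) "   # group 2: the body of a complete quoted string
    }{ defined $1 ? $1 : index($2, $needle) >= 0 ? '' : '"' . $2 . '"' }gex;
    return $text;
}

# Made-up sample: one string to drop, one to keep, one stray escaped quote.
print remove_quoted_containing('x "a \" foo" y "ok" \" z', 'foo'), "\n";
```

Because an escaped pair is always consumed as a unit, a \\ directly before a quote does not hide that quote, so no temporary placeholder is needed. This is only a sketch and assumes every quoted string is closed on the same line.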

+2




You are editing a .xml file, so I'm going to tell you what I usually do: use an XML parser. I like XML::Twig because I think it's easier to get to grips with at first. XML::LibXML is also good.

Now, based on the question you are asking, it looks like you are trying to rewrite the file path in an XML attribute.

So:

#!/usr/bin/env perl

use strict;
use warnings;

use XML::Twig;

#my $twig = XML::Twig -> parsefile ( 'test.xml');
my $twig = XML::Twig -> parse ( \*DATA );

foreach my $element ( $twig -> get_xpath('element[@path]') ) {
   my $path_att = $element -> att('path');
   $path_att =~ s,/\.\./\.\./bar/,,g;
   $element -> set_att('path', $path_att);
}

$twig -> set_pretty_print('indented_a');
$twig -> print;
__DATA__
<root>
   <element name="test" path="/path/to/dir/../../bar/some_dir">
   </element>
   <element name="test2" nopath="here" />
   <element path="/some_path">content</element>
</root>

      

XML::Twig also quite usefully supports parsefile_inplace, which modifies a file "sed style". The above is an illustration of the concept with some sample XML. With a clearer example of what you are trying to do, I should be able to improve it.

0

