Bash: delete a block between two patterns if a line exists in it

I am trying to use sed to accomplish the following. Let's say I have the following file (note: my actual file is more complex than this):

hello world
foo bar
people people
target
something
done

      

I want to check whether target occurs between two patterns, in this example between the lines foo bar and done (both lines inclusive), and remove the entire block if target is present.

I know how to delete the lines between two patterns with this sed command:

sed '/people.*/,/done/d' file

      

But I only want to remove the block if the string target exists between the two matching lines.

My logic was something like this:

sed -n '/people.*/,/done/p' file | check if target string exists | delete entire pattern found by sed

      

EDIT

I forgot to mention that there can be any number of words before and after target on the same line.

+3




4 answers


Sed

This deletes everything from $start through $end if $pattern occurs in between:

sed ":a;N;\$!ba; s/$start.*$pattern.*$end//g"

There are two steps (instructions) here:

  • Read the entire file as one line (which may be a problem for large files). For a very good explanation, see fooobar.com/questions/10927 / ... . The only difference is the extra backslash before the $!ba, which makes the command work inside double quotes; that is useful for passing Bash variables into the sed script.
  • Plain old search / replace.
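
The two steps above can be tried on the sample file from the question. This is a minimal sketch, assuming a sed in which . also matches the newlines inside the pattern space (GNU sed does); the variable values and the temporary file are illustrative only. The script is split into -e fragments so that the label a ends cleanly, which is a variation on the one-liner, not the exact command from the answer:

```shell
#!/bin/sh
# Sample input from the question, written to a temporary file.
tmpfile=$(mktemp)
printf '%s\n' 'hello world' 'foo bar' 'people people' 'target' 'something' 'done' > "$tmpfile"

# Illustrative values; they are pasted into the regex as-is, so they
# must not contain unescaped regex metacharacters or slashes.
start='foo bar'
pattern='target'
end='done'

# Step 1: loop with N until the last line, so the whole file is in the pattern space.
# Step 2: one substitution across the embedded newlines.
result=$(sed -e ':a' -e 'N' -e '$!ba' -e "s/$start.*$pattern.*$end//g" "$tmpfile")

printf '%s\n' "$result"
rm -f "$tmpfile"
```

The newline that preceded foo bar is not part of the match, so sed itself prints an empty line after hello world; the command substitution strips it here.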


Perl

To get non-greedy matching, if Perl is allowed, use:

perl -0777 -p -e "s/$start.*?$pattern.*?$end//s"

This also reads the entire file as one string. The /s modifier at the end makes . match newlines as well. Use .* instead of .*? to go back to greedy matching.

+4




sed is a great tool for simple substitutions on a single line, but all of its constructs for handling multiple lines became obsolete in the mid-1970s when awk was invented, so just use awk for simplicity, clarity, reliability, etc. With GNU awk for multi-char RS:



$ awk -v RS='^$' '{sub(/\nfoo bar\n.*target.*\ndone\n/,""); print}' file
hello world

      

+2




A way to do this without reading the entire file into memory first, and without running into greedy-match problems if the file contains done multiple times:

sed '/^people/ { :loop; N; /\ndone/ ! b loop; /target/ d }' filename

      

On Mac OS X, a newline seems to be necessary before the closing brace, so you can spread the script over two lines inside the quotes:

sed '/^people/ { :loop; N; /\ndone/ ! b loop; /target/ d 
}' filename

      

Or put this (more readable anyway) version of the code in a file, say foo.sed, and use sed -f foo.sed filename:

/^people/ {
  :loop
  N
  /\ndone/ ! b loop
  /target/ d
}

      

The code works like this:

/^people/ {

      

On a line starting with "people"

  :loop
  N
  /\ndone/ ! b loop

      

keep appending lines in a loop until one starts with done (that is the first time \ndone appears in the pattern space)

  /target/ d

      

If target occurs anywhere in what has been collected, throw it all away

}

      

Otherwise, processing continues as usual (which means printing the pattern space, because we did not pass -n to sed).
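
The walkthrough above can be checked with a quick run. This is a sketch on made-up input with two people sections, only one of which contains target; the temporary file is illustrative:

```shell
#!/bin/sh
# Two sections starting with "people"; only the first contains "target".
tmpfile=$(mktemp)
printf '%s\n' 'hello world' 'foo bar' 'people people' 'target' 'something' 'done' \
              'people again' 'test' 'done' 'bye' > "$tmpfile"

# Script written over several lines, which also avoids the closing-brace
# issue mentioned for Mac OS X.
result=$(sed '/^people/ {
  :loop
  N
  /\ndone/ ! b loop
  /target/ d
}' "$tmpfile")

printf '%s\n' "$result"
rm -f "$tmpfile"
```

The first section is dropped and the second is printed unchanged. The foo bar line survives because the block is opened by /^people/; with /^foo bar/ as the opening address the block from the question would be removed from foo bar through done.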

One possible improvement in reliability is

sed '/^people/ { :loop; N; /\ndone$/! { $! b loop }; /target/ d }' filename

      

or

/^people/ {
  :loop
  N
  /\ndone$/ ! {
    $ ! b loop
  }
  /target/ d
}

      

The change is /\ndone$/! { $! b loop }. This ends the loop on the last line of the file even if done has not been encountered, so an incomplete people section at the end of the file is not discarded (unless it contains target).
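
A short sketch of that end-of-file behaviour, using made-up input in which the last people section never reaches done. The function name strip_blocks is hypothetical and only wraps the script for reuse:

```shell
#!/bin/sh
# Hypothetical helper: runs the improved script on standard input.
strip_blocks() {
  sed '/^people/ {
  :loop
  N
  /\ndone$/ !{
    $!b loop
  }
  /target/ d
}'
}

# Unterminated section without "target": kept as-is.
unterminated=$(printf '%s\n' 'hello world' 'people people' 'something' | strip_blocks)
# Unterminated section with "target": still removed.
with_target=$(printf '%s\n' 'hello world' 'people people' 'target' | strip_blocks)

printf 'unterminated:\n%s\n\nwith target:\n%s\n' "$unterminated" "$with_target"
```

Without the $! guard, what happens when N runs out of input depends on the sed implementation, which is why the guarded form is the more predictable one.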

+1




Late answer

sed '/^foo bar *$/,/^done *$/{/^done *$/!{H;d};/^done *$/{H;g;s/.*//g;x;/.*target.*/d;s/^.//g}}'

      

Select all lines between /^foo bar *$/ and /^done *$/:

/foo bar/,/done/

      

This part, /^done *$/!{H;d}, takes every line from "foo bar" up to but not including the final "done" line and appends it to the hold space. It then deletes those lines from the pattern space.

This part, /^done *$/{H;g;s/.*//g;x;, takes the final "done" line and appends it to the hold space. The hold space now contains all lines from "foo bar" through "done". After that we clear the pattern space and exchange it with the hold space, so the collected lines end up in the pattern space and the hold space is empty (this keeps the hold space empty for the next range of lines between "foo bar" and "done").

finally,

/.*target.*/d 

      

we check whether target occurs in the multi-line pattern space. If so, the whole range of lines from "foo bar" to "done" is deleted.

This avoids reading the entire file as one line.

Example input:

hello world
foo bar
people people
target
something
done
foo bar
.....
.....
.....
done
foo bar
people people
test
something
done

      

Result:

hello world
foo bar
.....
.....
.....
done
foo bar
people people
test
something
done

      

Note: only the range of lines from "foo bar" through "done" that contains a "target" line is removed.
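
The example can be reproduced with the same commands spread over several lines. This is a sketch that assumes GNU sed (it relies on . matching the newlines in the pattern space); the temporary file and the comments are additions for illustration:

```shell
#!/bin/sh
# Example input from above: three "foo bar" ... "done" blocks.
tmpfile=$(mktemp)
printf '%s\n' 'hello world' \
  'foo bar' 'people people' 'target' 'something' 'done' \
  'foo bar' '.....' '.....' '.....' 'done' \
  'foo bar' 'people people' 'test' 'something' 'done' > "$tmpfile"

result=$(sed '/^foo bar *$/,/^done *$/{
  # Not the closing line yet: append to the hold space, print nothing.
  /^done *$/!{
    H
    d
  }
  # Closing line: move the collected block into the pattern space and
  # leave the hold space empty for the next block.
  /^done *$/{
    H
    g
    s/.*//g
    x
    /.*target.*/d
    s/^.//g
  }
}' "$tmpfile")

printf '%s\n' "$result"
rm -f "$tmpfile"
```

The final s/^.//g removes the newline that the first H put at the front of the collected block, so blocks that are kept print exactly as they appeared in the input.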

+1

