Use grep to match and erase the pattern and its previous line in a large chunk of text

Question

I have a very large text file that contains data similar to the following:

he/PRP have/VBD obtain/VBN the/DT ##archbishopric/NN## against/IN the/DT monk/NNS of/IN the/DT

craft/NN ,/Fc he/PRP obtain/VBD the/DT ##archbishopric/NN## of/IN besancon/NP ;/Fx and/CC have/VBD it/PRP in/IN
======>match found: \#\#\sof\/IN

succeed/VBN to/TO the/DT ##archbishopric/NN## ./Fp

klutzy/NN little/JJ ##scene/NN## where/WRB 1/Z brave/JJ french/JJ man/NN refuse/VBZ to/TO sit/VB down/RP for/IN fear/NN of/IN be/VBG discover/VBN ./Fp
======>match found: \#\#\swhere\/WRB\s

I would like to use grep to match and remove every pair of lines in which a line of text is immediately followed, after its newline character, by a line starting with ======>match found:, as in:

craft/NN ,/Fc he/PRP obtain/VBD the/DT ##archbishopric/NN## of/IN besancon/NP ;/Fx and/CC have/VBD it/PRP in/IN
======>match found: \#\#\sof\/IN

The marker line itself also ends with a newline character.

So, for the example above, I would like to run grep and get the following output:

he/PRP have/VBD obtain/VBN the/DT ##archbishopric/NN## against/IN the/DT monk/NNS of/IN the/DT

succeed/VBN to/TO the/DT ##archbishopric/NN## ./Fp

I've already tried:

grep -E -v '^.+\n======>match found:.+$' file.txt

that is, adding the regex .+*\n to the command, as suggested here, so that it also includes the previous line, but it doesn't work. Any suggestions?

regex grep

owwoow14 Jan 24 '13 at 9:27

2 answers

Multi-line grepping is complicated by the fact that traditional grep implementations only consider one line at a time, so adding \n to your pattern doesn't make sense.
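As a small illustration of that line-at-a-time behaviour (a sketch with made-up sample text, not taken from the question's data): a pattern that could only match across the line break finds nothing, while a pattern that fits within one line does.

```shell
# Two-line sample input (hypothetical); "foo" and "bar" sit on separate lines.
sample=$(printf 'foo\nbar')

# grep -c counts matching lines; "|| true" ignores grep's exit status 1 on no match.
spanning=$(printf '%s\n' "$sample" | grep -c 'foo.*bar' || true)
single=$(printf '%s\n' "$sample" | grep -c '^foo$' || true)

echo "spanning: $spanning, single-line: $single"   # prints: spanning: 0, single-line: 1
```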

If you have pcregrep available, it supports multi-line matching with the -M flag:

pcregrep -Mv '^.+\n======>match found:.+$'

Output:

he/PRP have/VBD obtain/VBN the/DT ##archbishopric/NN## against/IN the/DT monk/NNS of/IN the/DT


succeed/VBN to/TO the/DT ##archbishopric/NN## ./Fp

Thor Jan 24 '13 at 9:47

Lev Levitsky · Accepted Answer · 2013-01-24T09:44:26+0000

This sed command comes close to what you want:

$ sed -n 'N;/\n======>match found:/d; P;D' textfile 
he/PRP have/VBD obtain/VBN the/DT ##archbishopric/NN## against/IN the/DT monk/NNS of/IN the/DT


succeed/VBN to/TO the/DT ##archbishopric/NN## ./Fp
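If neither pcregrep nor the sed idiom is convenient, the same idea can be sketched in awk: hold each line back until the next one is seen, and discard the held line whenever the next one is a marker. This is only a sketch, not something proposed in the thread. The function name drop_marked_pairs is hypothetical, and it assumes the marker always starts at the beginning of a line with ======>match found:. Like the commands above, it leaves the surrounding blank lines in place.

```shell
# Hypothetical helper: print FILE without any "======>match found:" line
# and without the line directly above it.
drop_marked_pairs() {
  awk '
    /^======>match found:/ { held = 0; next }  # discard the marker and the held line
    held { print prev }                         # the held line was not followed by a marker
    { prev = $0; held = 1 }                     # hold the current line until the next is seen
    END { if (held) print prev }                # flush the final held line
  ' "$1"
}
```

It would be used as drop_marked_pairs file.txt > cleaned.txt. Because only one line is held at a time, memory use stays constant even on a very large file, and a final line with no marker after it is still printed by the END rule.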
