Use grep to match and erase a pattern in a large chunk of text

I have a very large text file that contains data similar to the following:

     but/CC as/IN 1/Z church/NP historian/NN/Fc 
     as/IN 1/Z "/Fe rupture/NN and/CC new/JJ beginning/NN century/NN ./Fp
    ======>match found: \#\#[a-z]+\/NN\#\#
    ======>match found: be\/V[A-Z]+(\s[.]{0,10})?\#\#
    ======>match found: \#\#\sof\/IN

      

I would like to use the (Linux) grep terminal command to match and erase all lines starting with:

======>match found:

and ending with a newline character.

So, for the example above, I would like to run grep and get the following output:

    but/CC as/IN 1/Z church/NP historian/NN/Fc
    as/IN 1/Z "/Fe rupture/NN and/CC new/JJ beginning/NN century/NN ./Fp

Thanks in advance for your help.

2 answers


grep -E -v '^======>match found:.+$' file.txt

-E enables extended regular expressions, while -v inverts the match, i.e. prints all lines that do not match.
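For example, assuming the sample above is saved as input.txt (the filenames here are just placeholders), the filtered result can be written to a new file:

    # keep only the data lines; write the surviving lines to a new file
    grep -v '^======>match found:' input.txt > cleaned.txt

Note that grep never modifies input.txt itself; the > redirection sends the filtered output to cleaned.txt.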

Sed is your friend

sed -i '/^======>match found:/d' largefilename.txt

This will remove all lines starting with ======>match found:

Note: the -i switch means that largefilename.txt is modified in place rather than the result being printed to stdout, which should be more efficient than using grep.
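If you want to keep a copy of the original before editing in place, GNU sed accepts an optional backup suffix after -i (the .bak suffix here is only an example):

    # delete the match-found lines in place, saving the original as largefilename.txt.bak
    sed -i.bak '/^======>match found:/d' largefilename.txt

This leaves the untouched original in largefilename.txt.bak and writes the filtered text back to largefilename.txt.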
