Sed cannot match 0 or 1 times

I am writing a bash script on CentOS 7. I need to use sed to delete all lines that do not contain .jpg or .jpeg.

Here's my script:

sed -i -e '/\.jp(e)?g/!d' myfile

      

But it deletes every line, so it doesn't work as expected.

However, both sed -i -e '/\.jpg/!d' myfile and sed -i -e '/\.jpeg/!d' myfile work well.



3 answers


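The capturing group ( ) and the quantifier ? (which matches the previous token 0 or 1 times) are ERE (Extended Regular Expression) syntax, not BRE (Basic Regular Expression) syntax.

sed uses BRE by default, so these tokens are treated as literal characters. Your pattern therefore only matches lines containing the literal text .jp(e)?g, and every other line is deleted.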
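The literal interpretation can be checked with a small experiment. This is only a sketch: the sample lines are made up for illustration, and the last one deliberately contains the literal characters ( ) and ?.

```shell
# Hypothetical sample input: two image names, one non-image,
# and one line that contains the literal text "(e)?".
input='a.jpg
b.jpeg
c.png
d.jp(e)?g'

# Default (BRE) mode: ( ) and ? are ordinary characters,
# so only the line with the literal text survives the !d filter.
bre_result=$(printf '%s\n' "$input" | sed '/\.jp(e)?g/!d')
printf '%s\n' "$bre_result"
```

With this input, only d.jp(e)?g is kept, which is why the original command appeared to delete everything in a file of ordinary image names.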

To enable ERE, pass -E (or -r, if available) to sed:

sed -E '/\.jp(e)?g/!d' myfile

      

The capturing group around e is redundant here:



sed -E '/\.jpe?g/!d' myfile
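A quick way to see the ERE version at work without touching a real file is to pipe some sample lines through it. The file names below are invented for the demonstration, and the sketch assumes a sed that accepts -E (GNU sed and the BSD seds do).

```shell
# Hypothetical sample input, including one name with too many e's.
input='a.jpg
b.jpeg
c.png
d.jpeeg'

# ERE mode: e? means "zero or one e", so .jpg and .jpeg lines are kept
# and everything else is deleted.
ere_result=$(printf '%s\n' "$input" | sed -E '/\.jpe?g/!d')
printf '%s\n' "$ere_result"
```

Only a.jpg and b.jpeg remain; c.png and d.jpeeg are deleted.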

      


Note that you can use these constructs from BRE by escaping them with \ (\( \) is standard BRE, while \? is a GNU extension), so the following will work with GNU sed:

sed '/\.jp\(e\)\?g/!d' myfile
sed '/\.jpe\?g/!d' myfile

      

Still, this is less readable than simply passing one option, i.e. -E. The only reason to prefer the escaped form is portability.
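If portability is the goal, it may help to know that \? is a GNU extension in BRE, whereas the interval expression \{0,1\} is part of POSIX BRE and expresses the same "zero or one" repetition. A minimal sketch, assuming a POSIX-conforming sed; the sample file names are made up:

```shell
# Hypothetical sample input.
input='a.jpg
b.jpeg
c.png'

# POSIX BRE: e\{0,1\} matches zero or one e, with no need for -E or \?.
portable_result=$(printf '%s\n' "$input" | sed '/\.jpe\{0,1\}g/!d')
printf '%s\n' "$portable_result"
```

This keeps a.jpg and b.jpeg and deletes c.png, the same result as the -E version.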



Using an escaped quantifier in the sed command may meet your requirement; the pattern below matches zero or one 'e':



sed -i -e '/jpe\?g/!d' myfile
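One thing worth checking with this variant: the pattern has no leading \. so it matches jpg or jpeg anywhere in a line, not just as an extension. The sketch below compares it with the dotted pattern on two invented lines and assumes GNU sed, since \? is a GNU extension.

```shell
# Hypothetical sample input: one real image name and one line that
# merely contains the letters "jpg".
input='photo.jpg
jpg-notes.txt'

# Without the dot, both lines match and are kept.
loose_result=$(printf '%s\n' "$input" | sed '/jpe\?g/!d')

# With the escaped dot, only the real extension matches.
strict_result=$(printf '%s\n' "$input" | sed '/\.jpe\?g/!d')

printf 'loose:\n%s\nstrict:\n%s\n' "$loose_result" "$strict_result"
```

Whether the looser match matters depends on what the file contains.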



This might work for you (GNU sed):

sed '/\.jp\(e\|\)g/!d' file

      

This uses alternation in which one of the alternatives is empty.

It may be easier to read with both alternatives spelled out:

sed '/\.jpeg\|\.jpg/!d' file

      

However, as already stated, use the ? quantifier (written \? in GNU sed's BRE):

sed '/\.jpe\?g/!d' file

      

N.B. * means zero or more, i.e.

sed '/\.jpe*g/!d' file

      

will allow .jpeeeeeeeeeeeeeeeeg
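The difference between the two quantifiers can be seen side by side. This is a sketch with made-up sample lines, assuming GNU sed for the \? form:

```shell
# Hypothetical sample input, including a name with several e's.
input='a.jpg
b.jpeg
c.jpeeeg'

# e* accepts any number of e's, so all three lines are kept.
star_result=$(printf '%s\n' "$input" | sed '/\.jpe*g/!d')

# e\? accepts at most one e, so c.jpeeeg is deleted.
opt_result=$(printf '%s\n' "$input" | sed '/\.jpe\?g/!d')

printf 'star:\n%s\noptional:\n%s\n' "$star_result" "$opt_result"
```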
