Find all matches of a two-character string in a text file and swap them

Question

I am searching a text file for an underscore preceded by a punctuation mark, [.?!;:]_, and I want to swap the order of the two characters.

For example, given the line

On this _line,_ I show an example. !_

I want to change it to:

On this _line_, I show an example. _!

I can find all occurrences with, say, The Silver Searcher or ripgrep:

rg '[.?!;:]_' myfile.txt

but I'm not sure how to then replace those two characters and write the result back to the same file or to a new file.

I could just run sed once for each punctuation mark, for example:

sed -ie 's/,_/_,/g' myfile.txt

then

sed -ie 's/\._/_\./g' myfile.txt

then ...

but it would be nice to accomplish this with a single command.

Is it possible to reference the matched text and use it in ripgrep's -r ARG option? Or am I barking up the wrong tree, and would I be wiser to use another tool?

string regex shell

Chris hanning Apr 16 17 at 2:54

2 answers

Here's one way to do it in a single command:

sed 's/\([^[:alnum:][:space:]_]\)\(_\)/\2\1/g' test.txt

You are essentially capturing the two characters and writing them back in the opposite order.

s/ - starts a substitution.

\( \) - escaped parentheses, which delimit a capture group in basic regular expressions. The escaping is required, even if it is ugly.

[ ] - defines a bracket expression (character class).

^ - negates the class when it is the first character inside the brackets.

[:alnum:], [:space:] - POSIX classes for letters and digits, and for whitespace. Shorthands such as \w and \s are not reliably interpreted inside a bracket expression by sed, so the POSIX classes are used instead.

[^[:alnum:][:space:]_] - any character that is not a letter, digit, whitespace, or underscore (e.g. punctuation).

\(_\) - the second capture group, the underscore. So the punctuation mark is captured as group 1, and the underscore next to it as group 2.

/\2\1/ - writes the two groups back in swapped order.

/g - applies the substitution to every match on a line, not just the first.

That's it. You can redirect the output to another file, or use sed's -i option to edit the file in place.
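As a minimal sketch of the "output to another file" route: the helper name swap_to_new_file is hypothetical, and the POSIX bracket classes used here are an assumption standing in for "any character that is not a letter, digit, underscore, or whitespace".

```shell
# Hypothetical helper: read $1, write the swapped text to $2.
# The input file is left untouched.
swap_to_new_file() {
  # Group 1: one character that is not alphanumeric, whitespace, or "_".
  # Group 2: the underscore that follows it. The replacement reverses them.
  sed 's/\([^[:alnum:][:space:]_]\)\(_\)/\2\1/g' "$1" > "$2"
}
```

It could be called as swap_to_new_file test.txt swapped.txt, where swapped.txt is a made-up output name.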

Nik Roby Apr 16 '17 at 3:27

mklement0 · Accepted Answer · 2017-04-16T02:59:46+0000

sed supports backreferences: the replacement argument of the s call can refer to capture groups defined in its regex argument (bash here-string (<<<) syntax is used for brevity):

$ sed -E 's/([.?!;:])_/_\1/g' <<<'On this _line,_ I show an example. !_'
On this _line,_ I show an example. _!

\1 refers to the first capture group ((...)) in the regex.
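The bracket expression above does not contain a comma, which is why ,_ is left unchanged in that output. A minimal sketch that adds the comma to the class, assuming that is what the question's example calls for (the function name swap_punct is hypothetical):

```shell
# Hypothetical helper: swap "<punctuation>_" to "_<punctuation>" on stdin.
# Assumption: the comma belongs in the set of punctuation marks to handle.
swap_punct() {
  sed -E 's/([.,?!;:])_/_\1/g'
}

printf '%s\n' 'On this _line,_ I show an example. !_' | swap_punct
# prints: On this _line_, I show an example. _!
```

printf with a pipe is used instead of a here-string so that the sketch also runs in shells other than bash.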

Note that -E was used to enable support for extended regular expressions, which use a more modern syntax; both GNU sed and BSD / macOS sed support it.

Generally, you don't need sed's -e option unless you are passing the sed script in multiple parts, in which case each part must be prefixed with -e.
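For illustration, a script passed in two parts might look like the sketch below, with each part introduced by its own -e. It combines the two per-character substitutions from the question into one call; the function name swap_two is hypothetical.

```shell
# Hypothetical helper: two script fragments in one sed invocation.
# The fragments run in order on every input line.
swap_two() {
  sed -e 's/,_/_,/g' -e 's/\._/_./g'
}

printf '%s\n' 'a,_ b._' | swap_two
# prints: a_, b_.
```

A single substitution with a bracket expression covers the same ground more compactly; the multi-part form is mainly useful when the fragments are unrelated.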

Regarding an in-place update of the original file:

-ie probably doesn't do (exactly) what you want: while it updates the input file in place (replacing it with a new file with the updated content), it also creates a backup file with the suffix e, because the e is interpreted as the option-argument of -i, that is, as the backup suffix.

Unless the goal is to create a backup file, the syntax - sadly - differs depending on which sed implementation you're using:

GNU sed: sed -i ...
- -i must not be directly followed by any other characters.

BSD / macOS sed: sed -i '' ...
- -i must be followed by '' (the empty string) as the next, separate argument.
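One way to sidestep the -i difference altogether is to write to a temporary file and copy the result back. This is only a sketch under that assumption, not something either sed implementation requires; the function name swap_in_place is hypothetical.

```shell
# Hypothetical helper: update file $1 without using sed -i.
swap_in_place() {
  file=$1
  tmp=$(mktemp) || return 1
  # Write the transformed text to a temporary file first, so the
  # original is only overwritten if sed succeeded.
  if sed -E 's/([.?!;:])_/_\1/g' "$file" > "$tmp"; then
    # Copying the content back (rather than moving the temp file)
    # keeps the original file's permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

It could be called as swap_in_place myfile.txt. Unlike -i with a suffix, this leaves no backup copy behind, so it is worth testing on a copy of the file first.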
