Find all matches of a two-character string in a text file and swap them
I am searching a text file for an underscore preceded by a punctuation mark, i.e. [.?!;:]_, and I want to swap the two characters.
For example, given the line
On this _line,_ I show an example. !_
I want to change it to:
On this _line_, I show an example. _!
I can find all cases with, say, The Silver Searcher or ripgrep:
rg '[.?!;:]_' myfile.txt
but I'm not sure how to then swap those two characters and write the result back in place or to a new file.
I could just run sed once for each punctuation mark, for example:
sed -ie 's/,_/_,/g' myfile.txt
then
sed -ie 's/\._/_\./g' myfile.txt
then ...
but it would be nice to accomplish this with a single command.
Is it possible to reference the matched text and use it in ripgrep's -r ARG option? Or am I barking up the wrong tree, and would I be wiser to use another tool?
sed supports backreferences: capture groups defined in the regex part of an s call can be referenced in its replacement part (the bash here-string syntax, <<<, is used for brevity):
$ sed -E 's/([.?!;:])_/_\1/g' <<<'On this _line,_ I show an example. !_'
On this _line,_ I show an example. _!
\1 refers to the first capture group, (...), in the regex.
Note that -E was used to enable extended regular expressions, which use the modern syntax; both GNU sed and BSD / macOS sed support it.
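The example line in the question also contains a comma, which the bracket expression [.?!;:] does not cover, so ",_" is left untouched above. A minimal sketch, assuming the comma should be swapped as well (adding it to the bracket expression is an assumption, not part of the original pattern):

```shell
# Assumption: ',' is added to the bracket expression so that ",_" is swapped too.
# The sample line is the one from the question.
result=$(printf '%s\n' 'On this _line,_ I show an example. !_' |
  sed -E 's/([.,?!;:])_/_\1/g')

printf '%s\n' "$result"
```

With the comma included, the result should match the output the question asks for.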
Generally, you don't need sed's -e option unless you are passing the sed script in multiple parts, in which case each part must be prefixed with -e.
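As a rough illustration of passing a script in multiple parts, the two per-character substitutions from the question can be given to a single sed call, each prefixed with -e. The input string here is made up for the example:

```shell
# Two script parts in one sed call; each part gets its own -e.
# The input 'a,_ b._' is a hypothetical sample, not from the original file.
result=$(printf '%s\n' 'a,_ b._' | sed -e 's/,_/_,/g' -e 's/\._/_./g')

printf '%s\n' "$result"
```

The single command with a capture group remains the shorter option; this only shows where -e is actually needed.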
Regarding an in-place update of the original file:
-ie probably doesn't do (exactly) what you want: while it does update the input file (replacing it with a new file with the updated content), it also creates a backup file with the suffix e, because the e is interpreted as the option-argument of -i.
Unless the goal is to create a backup file, the syntax, sadly, differs depending on which sed implementation you're using:
- GNU sed: sed -i ...
  - -i must not be directly followed by any other argument or characters.
- BSD / macOS sed: sed -i '' ...
  - -i must be followed by '' as the next, separate argument.
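One way to sidestep the -i differences altogether is to write the result to a temporary file and then move it over the original. This is a different technique from -i, sketched here under the assumption that a scratch copy is acceptable; the directory and file names are hypothetical stand-ins for myfile.txt:

```shell
# Hypothetical scratch file standing in for myfile.txt.
workdir=$(mktemp -d)
file="$workdir/myfile.txt"
printf '%s\n' 'On this _line;_ I show an example. !_' > "$file"

# Write the result to a temporary file, then replace the original
# only if sed succeeded. No -i is involved, so this behaves the same
# with GNU sed and BSD / macOS sed.
sed -E 's/([.?!;:])_/_\1/g' "$file" > "$file.tmp" && mv "$file.tmp" "$file"

updated=$(cat "$file")
printf '%s\n' "$updated"
```

Unlike -i, this leaves no backup behind, so keep a copy of the original if the substitution has not been checked first.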
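Here's one way to do it in one line:
sed 's/\([^[:alnum:][:space:]]\)\(_\)/\2\1/g' test.txt
You are essentially looking for two adjacent characters and swapping them.
s/ - starts the substitution
\( \) - the escaped parentheses form a capture group. In basic regular expressions they have to be escaped, even if it's ugly.
[ ] - defines a character class
^ - negates the class when it is the first character inside it
[:alnum:] - letters and digits
[:space:] - whitespace characters
[^[:alnum:][:space:]] - any character that is neither a letter, a digit, nor whitespace (e.g. punctuation). The \w and \s shorthands are avoided here because sed does not interpret them as shorthands inside a character class, so [^\w\s] would also match spaces and most letters.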
Next comes the second part of the match, the underscore, which forms the second capture group.
\(_\) - the punctuation found first is captured as group 1, and the underscore right after it is captured as group 2.
/\2\1/ - swaps groups 1 and 2 in the replacement
/g - applies the substitution to every match on a line ("globally").
That's it. Now you can redirect the output to another file, or use sed's -i switch to edit the file in place.
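A small sketch of the "output to another file" variant, run on the sample line from the question. It assumes POSIX character classes ([:alnum:], [:space:]) for "not a letter, digit, or whitespace"; the file names test.txt and swapped.txt inside a scratch directory are placeholders:

```shell
# Scratch directory with a hypothetical test.txt holding the sample line.
workdir=$(mktemp -d)
printf '%s\n' 'On this _line,_ I show an example. !_' > "$workdir/test.txt"

# Group 1: one character that is neither alphanumeric nor whitespace.
# Group 2: the underscore that follows it. The replacement swaps them.
sed 's/\([^[:alnum:][:space:]]\)\(_\)/\2\1/g' "$workdir/test.txt" > "$workdir/swapped.txt"

swapped=$(cat "$workdir/swapped.txt")
printf '%s\n' "$swapped"
```

Because the class is defined by exclusion, it also covers the comma, which the explicit list [.?!;:] does not.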