Unix one-liner to swap / transpose two lines in multiple text files?
I want to swap or transpose pairs of lines according to their line numbers (for example, switching the positions of lines 10 and 15) in multiple text files using a UNIX tool like sed or awk.
For example, I believe this sed command should swap lines 14 and 26 in the same file:
sed -n '14p' infile_name > outfile_name
sed -n '26p' infile_name >> outfile_name
How can this be extended to work on multiple files? Any one-liner solutions are appreciated.
If you want to edit a file, you can use ed, the standard editor. Your task is rather simple in ed:
printf '%s\n' 14m26 26-m14- w q | ed -s file
How does it work?
- 14m26 tells ed to take line #14 and move it after line #26.
- 26-m14- tells ed to take the line before line #26 (which is your original line #26) and move it after the line preceding line #14 (which is where your line #14 originally was).
- w tells ed to write the file.
- q tells ed to quit.
If your numbers are in a variable, you can do:
linea=14
lineb=26
{
printf '%dm%d\n' "$linea" "$lineb"
printf '%d-m%d-\n' "$lineb" "$linea"
printf '%s\n' w q
} | ed -s file
or something similar. Make sure that linea<lineb.
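Since the question asks about multiple files, the same ed commands can be run in a loop. The following is only a sketch: swap_with_ed is a hypothetical helper name, and it assumes that ed is installed, that the first line number is smaller than the second, and that every file has at least that many lines.

```shell
# Hypothetical helper (not part of ed): swap lines $1 and $2 in each file that follows.
# Assumes $1 < $2 and that every file contains at least $2 lines.
swap_with_ed() {
    linea=$1 lineb=$2
    shift 2
    for f in "$@"; do
        # Same two move commands as above, then write and quit.
        printf '%s\n' "${linea}m${lineb}" "${lineb}-m${linea}-" w q |
            ed -s "$f" || return
    done
}

# Example call (file names are placeholders):
#   swap_with_ed 14 26 file1.txt file2.txt
```

Running ed once per file keeps each edit independent, so a failure on one file stops the loop without touching the files that follow.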
- If you want robust in-place updating of your input files, use gniourf_gniourf's excellent ed-based answer.
- If you have GNU sed and want to update multiple files in place at once, use @potong's excellent GNU sed-based answer (see below for a portable alternative, and further below for an explanation).
Note: ed truly updates the existing file, whereas sed's -i option creates a temporary file behind the scenes, which then replaces the original file. While this is usually not a problem, it can have unwanted side effects, most notably replacing a symlink with a regular file (by contrast, file permissions are preserved correctly).
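The symlink side effect is easy to see in a scratch directory. This is a minimal sketch assuming GNU sed; the directory and file names are made up for the demonstration.

```shell
# Scratch directory with a regular file and a symlink pointing to it.
demo_dir=$(mktemp -d)
printf 'one\ntwo\n' > "$demo_dir/real.txt"
ln -s real.txt "$demo_dir/link.txt"

# Edit "in place" through the symlink.
sed -i '1s/one/ONE/' "$demo_dir/link.txt"

# sed wrote a new file and renamed it over link.txt, so the link is gone
# and the file it used to point to was never modified.
if [ -L "$demo_dir/link.txt" ]; then
    echo "link.txt is still a symlink"
else
    echo "link.txt is now a regular file"
fi
cat "$demo_dir/real.txt"
```

GNU sed also offers a --follow-symlinks option that edits the link target instead; whether that is appropriate depends on your setup.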
Below are POSIX-compliant shell functions that wrap both answers.
Stdin / stdout handling based on @potong's excellent answer :
- POSIX sed does not support -i for in-place updating.
- It also does not support using \n inside a character class, so [^\n] must be replaced with a cumbersome workaround that positively defines all characters except \n that can occur on a line - this is achieved with a character class combining printable characters with all (ASCII) control characters other than \n, included as literals (via a command substitution using printf).
- Also note the need to split the sed script into two -e options, because POSIX sed requires that a branching command (b, in this case) be terminated either by an actual newline or by continuing in a separate -e option.
# SYNOPSIS
# swapLines lineNum1 lineNum2
swapLines() {
[ "$1" -ge 1 ] || { printf "ARGUMENT ERROR: Line numbers must be decimal integers >= 1.\n" >&2; return 2; }
[ "$1" -le "$2" ] || { printf "ARGUMENT ERROR: The first line number ($1) must be <= the second ($2).\n" >&2; return 2; }
sed -e "$1"','"$2"'!b' -e ''"$1"'h;'"$1"'!H;'"$2"'!d;x;s/^\([[:print:]'"$(printf '\001\002\003\004\005\006\007\010\011\013\014\015\016\017\020\021\022\023\024\025\026\027\030\031\032\033\034\035\036\037\177')"']*\)\(.*\n\)\(.*\)/\3\2\1/'
}
Example:
$ printf 'line 1\nline 2\nline 3\n' | swapLines 1 3
line 3
line 2
line 1
An in-place update based on gniourf_gniourf's excellent answer :
Small caveats:
- While ed is a POSIX utility, it does not come preinstalled on all platforms, notably not on Debian or in the Cygwin and MSYS Unix-emulation environments for Windows.
- ed always reads the entire input file into memory.
# SYNOPSIS
# swapFileLines lineNum1 lineNum2 file
swapFileLines() {
[ "$1" -ge 1 ] || { printf "ARGUMENT ERROR: Line numbers must be decimal integers >= 1.\n" >&2; return 2; }
[ "$1" -le "$2" ] || { printf "ARGUMENT ERROR: The first line number ($1) must be <= the second ($2).\n" >&2; return 2; }
ed -s "$3" <<EOF
H
$1m$2
$2-m$1-
w
EOF
}
Example:
$ printf 'line 1\nline 2\nline 3\n' > file
$ swapFileLines 1 3 file
$ cat file
line 3
line 2
line 1
Explanation of @potong's GNU sed-based answer:
His command swaps lines 10 and 15:
sed -ri '10,15!b;10h;10!H;15!d;x;s/^([^\n]*)(.*\n)(.*)/\3\2\1/' f1 f2 fn
- -r activates support for extended regular expressions; here, in particular, it allows unescaped parentheses to be used to form capture groups.
- -i specifies that the files given as operands (f1, f2, fn) be updated in place, without a backup, since no optional backup-file suffix is attached to -i.
- 10,15!b means that all lines that do not (!) fall within the range of lines 10 through 15 branch (b) implicitly to the end of the script (given that no target label name follows b), which means that the following commands are skipped for those lines. In effect, they are simply printed as they are.
- 10h copies (h) line number 10 (the start of the range) into the so-called hold space, which is an auxiliary buffer.
- 10!H appends (H) every line that is not line 10 - which in this case means lines 11 through 15 - to the hold space.
- 15!d deletes (d) every line that is not line 15 (here, lines 10 through 14) and skips the remaining commands. Because these lines are deleted, they are not printed.
- x, which is executed only for line 15 (the end of the range), swaps the so-called pattern space with the contents of the hold space, which at this point holds all the lines in the range (10 through 15); the pattern space is the buffer that sed commands operate on, and whose contents are printed by default (unless -n was specified).
- s/^([^\n]*)(.*\n)(.*)/\3\2\1/ then uses capture groups (parenthesized subexpressions of the regex that forms the first argument passed to function s) to split the contents of the pattern space into the first line (^([^\n]*)), the middle lines ((.*\n)), and the last line ((.*)), and then, in the replacement string (the second argument passed to function s), uses backreferences to place the last line (\3) before the middle lines (\2), followed by the first line (\1), effectively swapping the first and last lines in the range. Finally, the modified pattern space is printed.
As you can see, only the range of lines spanning the two lines to swap is held in memory, while all other lines are passed through individually, which makes this approach memory-efficient.
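The buffering strategy can also be spelled out with awk, which the question mentions as an option. This is not @potong's command but a sketch of the same idea with a different tool; swap_lines_awk and swap_lines_in_files are hypothetical names, and the in-place wrapper assumes mktemp is available.

```shell
# swap_lines_awk A B   (reads stdin, writes stdout; requires 1 <= A < B)
swap_lines_awk() {
    [ "$1" -ge 1 ] && [ "$1" -lt "$2" ] || {
        printf 'usage: swap_lines_awk A B  (with 1 <= A < B)\n' >&2
        return 2
    }
    awk -v a="$1" -v b="$2" '
        NR < a || NR > b { print; next }     # outside the range: pass through
        NR < b { buf[NR] = $0; next }        # buffer lines A .. B-1 only
        {                                    # NR == B: emit the range, ends swapped
            print                            # line B first
            for (i = a + 1; i < b; i++) print buf[i]
            print buf[a]                     # line A last
            swapped = 1
        }
        END {                                # input ended before line B: flush as is
            if (!swapped)
                for (i = a; i <= NR; i++) if (i in buf) print buf[i]
        }
    '
}

# swap_lines_in_files A B file...   (rewrites each file via a temporary copy)
swap_lines_in_files() {
    a=$1 b=$2
    shift 2
    for f in "$@"; do
        tmp=$(mktemp) || return
        if swap_lines_awk "$a" "$b" < "$f" > "$tmp"; then
            cat "$tmp" > "$f"    # overwrite the contents, keeping the file itself
        fi
        rm -f "$tmp"
    done
}
```

Writing the result back with cat rather than mv keeps the original file (and any symlink pointing to it) in place, at the cost of a short window in which the file is incomplete.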
The following helper script lets you use the power of find ... -exec ./script '{}' l1 l2 \; to locate the target files and swap lines l1 and l2 in each file in place. (This requires that there are no identical duplicate lines in the file that fall within the search range.) The script uses sed to read the two lines to swap from each file into an indexed array and then passes those lines to sed to complete the swap by matching. The sed call uses GNU sed's 0,/regexp/ address form to restrict the second substitution to the first occurrence. An example of using the helper script below to swap lines 5 and 15 in all matching files:
find . -maxdepth 1 -type f -name "lnum*" -exec ../swaplines.sh '{}' 5 15 \;
For example, the find call above found the files lnumorig.txt and lnumfile.txt in the current directory, which initially contained:
$ head -n20 lnumfile.txt.bak
1 A simple line of test in a text file.
2 A simple line of test in a text file.
3 A simple line of test in a text file.
4 A simple line of test in a text file.
5 A simple line of test in a text file.
6 A simple line of test in a text file.
<snip>
14 A simple line of test in a text file.
15 A simple line of test in a text file.
16 A simple line of test in a text file.
17 A simple line of test in a text file.
18 A simple line of test in a text file.
19 A simple line of test in a text file.
20 A simple line of test in a text file.
and swapped lines 5 and 15 as intended:
$ head -n20 lnumfile.txt
1 A simple line of test in a text file.
2 A simple line of test in a text file.
3 A simple line of test in a text file.
4 A simple line of test in a text file.
15 A simple line of test in a text file.
6 A simple line of test in a text file.
<snip>
14 A simple line of test in a text file.
5 A simple line of test in a text file.
16 A simple line of test in a text file.
17 A simple line of test in a text file.
18 A simple line of test in a text file.
19 A simple line of test in a text file.
20 A simple line of test in a text file.
The helper script itself:
#!/bin/bash
[ -z "$1" ] && { # validate required input (defaults set below)
printf "error: insufficient input calling '%s'. usage: file [line1 line2]\n" "${0//*\//}" 1>&2
exit 1
}
l1=${2:-10} # default/initialize line numbers to swap
l2=${3:-15}
while IFS=$'\n' read -r line; do # read lines to swap into indexed array
a+=( "$line" );
done <<<"$(sed -n $((l1))p "$1" && sed -n $((l2))p "$1")"
((${#a[@]} < 2)) && { # validate 2 lines read
printf "error: requested lines '%d & %d' not found in file '%s'\n" $l1 $l2 "$1"
exit 1
}
# swap lines in place with sed (remove .bak for no backups)
sed -i.bak -e "s/${a[1]}/${a[0]}/" -e "0,/${a[0]}/s/${a[0]}/${a[1]}/" "$1"
exit 0
Even though I didn't manage to do all of this in a one-liner, I decided it was worth posting in case you can use it or take ideas from it. Note: if you do use it, test it to your satisfaction before turning it loose on your system. The script currently uses sed -i.bak ... to create backup copies of the modified files for testing purposes. You can remove the .bak once you are satisfied that it meets your needs.
If you have no use for setting default swap lines in the helper script itself, I would change the first validation check to [ -z "$1" -o -z "$2" -o -z "$3" ] to make sure all required arguments are given when the script is called.
While the script identifies the lines to be swapped by number, it relies on a direct match of each line's content to perform the swap. This means that any identical duplicate line up to the end of the swap range will cause an unintended match, and the intended lines will not be swapped. This is part of the limitation imposed by not storing every line in the range of lines to be swapped, as discussed in the comments. It is a tradeoff. There are many, many ways to approach this, each with its own advantages and disadvantages. Let me know if you have any questions.
Brute force method
Following your comment, I revised the helper script to use a brute-force copy/swap method, which eliminates the problem with duplicate lines in the search range. This helper obtains the two lines through sed as in the original, but then reads all lines from file into tmpfile, swapping the appropriately numbered lines when they are encountered. Once tmpfile is filled, it is copied over the original file and tmpfile is deleted.
#!/bin/bash
[ -z "$1" ] && { # validate required input (defaults set below)
printf "error: insufficient input calling '%s'. usage: file [line1 line2]\n" "${0//*\//}" 1>&2
exit 1
}
l1=${2:-10} # default/initialize line numbers to swap
l2=${3:-15}
while IFS=$'\n' read -r line; do # read lines to swap into indexed array
a+=( "$line" );
done <<<"$(sed -n $((l1))p "$1" && sed -n $((l2))p "$1")"
((${#a[@]} < 2)) && { # validate 2 lines read
printf "error: requested lines '%d & %d' not found in file '%s'\n" $l1 $l2 "$1"
exit 1
}
# create tmpfile; remove it on exit without touching the original
fn="$1"
tmpfn="$(mktemp swap_XXX)" || exit 1
rmtemp () { rm -f "$tmpfn"; }
trap rmtemp EXIT
trap 'exit 1' SIGTERM SIGINT
declare -i n=1
# write the file to tmpfile, swapping the two numbered lines
while IFS=$'\n' read -r line; do
    if ((n == l1)); then
        printf "%s\n" "${a[1]}" >> "$tmpfn"
    elif ((n == l2)); then
        printf "%s\n" "${a[0]}" >> "$tmpfn"
    else
        printf "%s\n" "$line" >> "$tmpfn"
    fi
    ((n++))
done < "$fn"
# copy back only after the whole file has been processed,
# so an interrupted run cannot overwrite the original with partial output
cp "$tmpfn" "$fn"
exit 0
If the line numbers to be swapped are fixed, you can try something like the sed command in the following example to swap the lines in multiple files in place:
#!/bin/bash
# prep test files
for f in a b c ; do
( for i in {1..30} ; do echo $f$i ; done ) > /tmp/$f
done
sed -i -s -e '14 {h;d}' -e '15 {N;N;N;N;N;N;N;N;N;N;G;x;d}' -e '26 G' /tmp/{a,b,c}
# -i: inplace editing
# -s: treat each input file separately
# 14 {h;d} # first swap line: hold ; suppress
# 15 {N;N;...;G;x;d} # lines between: collect, append held line; hold result; suppress
# 26 G # second swap line: append held lines (and output them all)
# dump test files
cat /tmp/{a,b,c}
(This is consistent with Ethan Reisner's comment.)
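If the line numbers are not known in advance, the run of N commands can be generated instead of typed out. The following is a sketch only: swap_fixed_gnu is a hypothetical name, it relies on the GNU sed options -i and -s used above, and it assumes the second line number is at least two greater than the first and that every file has at least that many lines.

```shell
# swap_fixed_gnu A B file...   (GNU sed only; requires B >= A + 2)
swap_fixed_gnu() {
    a=$1 b=$2
    shift 2
    [ "$a" -ge 1 ] && [ "$b" -ge $((a + 2)) ] || {
        printf 'usage: swap_fixed_gnu A B file...  (with B >= A + 2)\n' >&2
        return 2
    }
    # One N for each line strictly between A+1 and B (B - A - 2 in total),
    # so that the pattern space at line A+1 grows to hold lines A+1 .. B-1.
    n_cmds=
    i=$((a + 2))
    while [ "$i" -lt "$b" ]; do
        n_cmds="${n_cmds}N;"
        i=$((i + 1))
    done
    # Same three steps as above: hold line A, collect the middle lines and
    # append the held line, then append everything after line B.
    sed -i -s -e "$a {h;d}" -e "$((a + 1)) {${n_cmds}G;x;d}" -e "$b G" "$@"
}

# Example call, equivalent to the command above:
#   swap_fixed_gnu 14 26 /tmp/a /tmp/b /tmp/c
```

Adjacent lines (B equal to A + 1) are rejected because the second and third expressions would then address the same line.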
If you want to swap two lines, you can pass the file through sed twice. You could do it in a single sed script if you wanted, but this works:
e.g.
test.txt: for a in {1..10}; do echo "this is line $a"; done >> test.txt
this is line 1
this is line 2
this is line 3
this is line 4
this is line 5
this is line 6
this is line 7
this is line 8
this is line 9
this is line 10
Then to swap lines 6 and 9:
sed ':a;6,8{6h;6!H;d;ba};9{p;x};' test.txt | sed '7{h;d};9{p;x}'
this is line 1
this is line 2
this is line 3
this is line 4
this is line 5
this is line 9
this is line 7
this is line 8
this is line 6
this is line 10
The first sed builds up the hold space with lines 6 through 8. At line 9, it prints line 9 and then prints the hold space (lines 6 through 8); this accomplishes the first move, putting line 9 in place of line 6. Note: 6h; 6!H avoids a newline at the top of the pattern space.
The second move happens in the second sed script: it saves line 7 of the intermediate output (the original line 6) to the hold space, deletes it, and prints it after line 9.
To make it quasi-general, you can use variables like this:
A=3 && B=7 && sed ':a;'${A}','$((${B}-1))'{'${A}'h;'${A}'!H;d;ba};'${B}'{p;x};' test.txt | sed $(($A+1))'{h;d};'${B}'{p;x}'
Where A and B are the lines you want to swap, in this case lines 3 and 7.
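The two-pass idea can be wrapped in a function that also checks its arguments. This is a sketch rather than a drop-in replacement: swap_two_pass is a hypothetical name, and the script is restated with separate -e options and without the :a label and ba branch, which are never reached because d ends the cycle first. The check that B is at least A + 2 matters: with adjacent lines, the second pass deletes the line at A + 1 and never reaches the command that would print it back, so a line is lost.

```shell
# swap_two_pass A B   (reads stdin, writes stdout; requires B >= A + 2)
swap_two_pass() {
    a=$1 b=$2
    [ "$a" -ge 1 ] && [ "$b" -ge $((a + 2)) ] || {
        printf 'usage: swap_two_pass A B  (with B >= A + 2)\n' >&2
        return 2
    }
    # Pass 1: hold lines A .. B-1, print line B first, then the held block.
    # Pass 2: the original line A now sits at A+1; move it to just after line B.
    sed -e "${a}h" -e "$((a + 1)),$((b - 1))H" -e "${a},$((b - 1))d" -e "${b}{p;x;}" |
        sed -e "$((a + 1)){h;d;}" -e "${b}{p;x;}"
}

# Example call:
#   swap_two_pass 3 7 < test.txt
```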