Sort file by length of characters in first column / line
I need to sort a file based on the number of characters in the first column.
I have no idea how to do this. I am using Linux so sed / awk / sort are available.
.abs is bla bla 12
.abc is bla se 23 bla
.fe is bla bla bla
.jpg is pic extension
.se is for swedish domains
I want to sort these rows based on the length of the first column in each row. Some lines start with 4 characters, some start with 3 or 2. I want the result to be something like this:
.fe is bla bla bla
.se is for swedish domains
.abs is bla bla 12
.abc is bla se 23 bla
.jpg is pic extension
Is it possible?
Prefix each line with the length of its first word, then sort numerically:
awk '{ print length($1) " " $0 }' "$FILE" | sort -n
If necessary, strip the auxiliary length field afterwards with cut -d ' ' -f 2-.
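Putting the three steps together (prefix, sort, strip), a minimal sketch might look like the following. The function name sort_by_first_len is made up for illustration, and the -s (stable) option is assumed to be available, as it is in GNU sort; it keeps lines with equally long first words in their original order, which is what the expected output in the question shows.

```shell
# Hypothetical helper: sort lines by the length of their first field,
# keeping the input order among lines whose first fields are equally long.
sort_by_first_len() {
    awk '{ print length($1) "\t" $0 }' "$@" |   # decorate: length, tab, original line
        sort -s -n -k1,1 |                      # stable numeric sort on the length only
        cut -f2-                                # undecorate: drop the length column
}

# Usage with the sample rows from the question (reads stdin when no file is given).
printf '%s\n' '.abs is bla bla 12' '.abc is bla se 23 bla' '.fe is bla bla bla' \
    '.jpg is pic extension' '.se is for swedish domains' | sort_by_first_len
```

A tab is used as the separator here so that cut can remove the prefix without touching the spaces inside the line.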
You can also do this with coreutils, albeit rather inefficiently:
paste -d' ' <(cut -d' ' -f1 infile | xargs -l sh -c 'echo "$1" | wc -c' '{}') infile |
sort -n | cut -d' ' -f2-
Or with GNU parallel, if available:
paste -d' ' <(cut -d' ' -f1 infile | parallel wc -c '<<< {}') infile |
sort -n | cut -d' ' -f2-
Or with bash:
while read -r c1 rest; do echo "${#c1} $c1 $rest"; done <infile |
sort -n | cut -d' ' -f2-
Alternatively, strip the length prefix with sed afterwards:
awk '{print length($1)" "$0}' temp.txt | sort -n | sed -re 's/^[0-9]+ //'
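Whichever variant is used, the length prefix has to be compared numerically. A plain lexical sort only looks right while every length has a single digit, as this small illustration (not taken from the answers above) suggests:

```shell
# Lexical order compares character by character, so "10" sorts before "9".
printf '%s\n' 9 10 | sort
# Numeric order (-n) compares the values instead, so 9 comes first.
printf '%s\n' 9 10 | sort -n
```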