Sort the columns of a file in Bash

I have the following shell script that reads data from a file entered at the command line. The file is a matrix of numbers and I need to separate the file by columns and then sort the columns. Right now I can read the file and output the individual columns, but I am at a loss as to how to sort. I entered a sort operator, but it only sorts the first column.

EDIT: I decided to take another route and transpose the matrix, turning the columns into rows. I have to calculate the mean and median later, and I have already done this successfully for rows earlier in the script, so it was suggested that I "spin" the matrix to turn the columns into rows.

Here is my UPDATED code

    declare -a col=( )
    read -a line < "$1"
    numCols=${#line[@]}                          # save number of columns

    index=0
    while read -a line ; do
        for (( colCount=0; colCount<${#line[@]}; colCount++ )); do
            col[$index]=${line[$colCount]}
            ((index++))
        done
    done < "$1"

    for (( width = 0; width < numCols; width++ )); do
        for (( colCount = width; colCount < ${#col[@]}; colCount += numCols )); do
            printf "%s\t" ${col[$colCount]}
        done
        printf "\n"
    done

      

This gives me the following output:

    1 9 6 3 3 6
    1 3 7 6 4 4
    1 4 8 8 2 4
    1 5 9 9 1 7
    1 5 7 1 4 7

      

Although I'm currently looking for:

    1 3 3 6 6 9
    1 3 4 4 6 7
    1 2 4 4 8 8
    1 1 5 7 9 9
    1 1 4 5 7 7

      

To try and sort the data, I've tried the following:

    sortCol=${col[$colCount]}
    eval col[$colCount]='($(sort <<<"${'$sortCol'[*]}"))'

      

I also tried the following, which is how I sorted each row after reading it from the file:

    sortCol=( $(printf '%s\t' "${col[$colCount]}" | sort -n) )

      

If you could provide any insight on this it would be appreciated!



4 answers


Please note that, as mentioned in the comments, a pure bash solution is not very pretty. There are several ways to do this, but this is probably the most direct one. The following reads all the values into a single flat array, row by row, and stores the matrix stride (the number of columns) so that each column can be pulled out into its own temporary array and sorted. The sorted columns are appended, one after another, to a new array a2, so a2 holds the result in transposed form. Printing a2 transposed restores the original matrix layout with every column sorted.

Note: this works for any number of columns in your file.

#!/bin/bash

test -z "$1" && {           ## validate number of input
    printf "insufficient input. usage:  %s <filename>\n" "${0//*\//}"
    exit 1;
}

test -r "$1" || {           ## validate file was readable
    printf "error: file not readable '%s'. usage:  %s <filename>\n" "$1" "${0//*\//}"
    exit 1;
}

## function: my sort integer array - accepts array and returns sorted array
## Usage: array=( $(msia "${array[@]}") )
msia() {
    local a=( "$@" )
    local sz=${#a[@]}
    local _tmp
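    ## 0 or 1 elements are already sorted; print them unchanged so the
    ## caller's command substitution never captures a warning message
    [[ $sz -lt 2 ]] && { echo "${a[@]}"; return 0; }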
    for((i=0;i<$sz;i++)); do
        for((j=$((sz-1));j>i;j--)); do
        [[ ${a[$i]} -gt ${a[$j]} ]] && {
            _tmp=${a[$i]}
            a[$i]=${a[$j]}
            a[$j]=$_tmp
        }
        done
    done
    echo "${a[@]}"          ## locals are discarded on return, no unset needed
    return 0
}

declare -a a1               ## declare arrays and matrix variables
declare -a a2
declare -i cnt=0
declare -i stride=0
declare -i sz=0

while read -r line; do      ## read all lines into array
    a1+=( $line )
    (( cnt == 0 )) && stride=${#a1[@]}  ## calculate matrix stride
    (( cnt++ ))
done < "$1"

sz=${#a1[@]}                ## calculate matrix size
                            ## print original array
printf "\noriginal array:\n\n"
for ((i = 0; i < sz; i += stride)); do
    for ((j = 0; j < stride; j++)); do
        printf " %s" ${a1[i+j]}
    done
    printf "\n"
done

                            ## sort columns from stride array
for ((j = 0; j < stride; j++)); do
    for ((i = 0; i < sz; i += stride)); do
        arow+=( ${a1[i+j]} )
    done
    a2+=( $(msia ${arow[@]}) )  ## create sorted array
    unset arow
done
                            ## print the sorted array
printf "\nsorted array:\n\n"
for ((j = 0; j < cnt; j++)); do
    for ((i = 0; i < sz; i += cnt)); do
        printf " %s" ${a2[i+j]}
    done
    printf "\n"
done

exit 0

      



Output

$ bash sort_cols2.sh dat/matrix.txt

original array:

 1 1 1 1 1
 9 3 4 5 5
 6 7 8 9 7
 3 6 8 9 1
 3 4 2 1 4
 6 4 4 7 7

sorted array:

 1 1 1 1 1
 3 3 2 1 1
 3 4 4 5 4
 6 4 4 7 5
 6 6 8 9 7
 9 7 8 9 7
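
If calling an external tool is acceptable, the hand-written sort could be swapped for `sort -n`. The following is only a minimal sketch of the stride idea used above, not a drop-in replacement for the script: `sorted_column` is a hypothetical helper name, and it assumes the matrix has already been flattened row by row and that `sort` and `paste` are available.

```shell
#!/bin/bash
# sorted_column is a hypothetical helper, not part of the script above.
# $1 = stride (number of columns), $2 = zero-based column index,
# remaining arguments = the matrix values flattened row by row.
sorted_column() {
    local stride=$1 col=$2 i
    shift 2
    local -a flat=( "$@" )
    # Step through the flat array one stride at a time to walk down a column,
    # sort the values numerically, then join them on one line.
    for (( i = col; i < ${#flat[@]}; i += stride )); do
        printf '%s\n' "${flat[i]}"
    done | sort -n | paste -sd' ' -
}

# First three rows of the sample matrix; column 0 holds 1, 9 and 6.
sorted_column 5 0  1 1 1 1 1  9 3 4 5 5  6 7 8 9 7
```

With the three rows shown, this prints `1 6 9`. Using `sort -n` also keeps multi-digit values in numeric order.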

      



Awk script (asort is a GNU awk extension, so this needs gawk)

awk '
{for(i=1;i<=NF;i++)a[i]=a[i]" "$i}      #Add to column array
END{
        for(i=1;i<=NF;i++){
                split(a[i],b)          #Split column
                x=asort(b)             #sort column
                for(j=1;j<=x;j++){     #loop through sort
                        d[j]=d[j](d[j]~/./?" ":"")b[j]  #Recreate lines
                }
        }
for(i=1;i<=NR;i++)print d[i]          #Print lines
}' file

      



Output

1 1 1 1 1
3 3 2 1 1
3 4 4 5 4
6 4 4 7 5
6 6 8 9 7
9 7 8 9 7
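
Where gawk is not installed, roughly the same result can be sketched with standard tools: pull each column out with plain awk, sort it with `sort -n`, and glue the sorted columns back together with `paste`. This is only an illustration under assumptions: `sort_columns` is a made-up function name, the file is assumed to be whitespace-separated with the same number of fields on every line, and `mktemp -d` is assumed to be available.

```shell
#!/bin/bash
# sort_columns is a hypothetical name; it prints the matrix in file $1
# with every column sorted numerically and independently of the others.
sort_columns() {
    local file=$1 ncols c tmpdir
    local -a parts=()
    ncols=$(awk 'NR == 1 { print NF }' "$file")   # column count from row 1
    [[ -n $ncols ]] || return 0                   # empty file: nothing to do
    tmpdir=$(mktemp -d) || return 1
    for (( c = 1; c <= ncols; c++ )); do
        # Extract column c, sort it numerically, keep it in its own file.
        awk -v c="$c" '{ print $c }' "$file" | sort -n > "$tmpdir/col$c"
        parts+=( "$tmpdir/col$c" )
    done
    paste -d' ' "${parts[@]}"    # put the sorted columns back side by side
    rm -rf "$tmpdir"
}

# Usage: sort_columns matrix.txt
```

The temporary files keep the sketch simple; nothing is held in shell arrays, so it does not depend on any particular bash version beyond basic array support.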

      



Here's my entry for this little exercise. It should handle an arbitrary number of columns. I assume they are separated by spaces:

#!/bin/bash

linenumber=0
while read -r line; do
        i=0
        # Create an array for each column.
        for number in $line; do
                [ "$linenumber" -eq 0 ] && eval "array$i=()"
                eval "array$i+=($number)"
                (( i++ ))
        done
        # Remember the column count from the first line.
        [ "$linenumber" -eq 0 ] && numcols=$i
        (( linenumber++ ))
done < "$1"
IFS=$'\n'
# Sort each column numerically (valid indices are 0..numcols-1)
for (( j = 0; j < numcols; j++ )); do
        thisarray=array$j
        eval array$j='($(sort -n <<<"${'$thisarray'[*]}"))'
done
# Print each array's 0th entry, then 1st, then 2nd, etc...
for (( k = 0; k < ${#array0[@]}; k++ )); do
        for (( j = 0; j < numcols; j++ )); do
                eval 'printf "%s " "${array'$j'['$k']}"'
        done
        echo ""
done
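
As an aside, the `eval` used for the sorting step can be avoided on newer shells. The sketch below assumes bash 4.3 or later, where `local -n` creates a nameref; `sort_named_array` is a hypothetical function name, and the array is assumed to hold plain numbers with no whitespace or glob characters.

```shell
#!/bin/bash
# Assumes bash 4.3+ (namerefs). sort_named_array is a hypothetical helper
# that sorts, in place and numerically, the array whose NAME is passed as $1.
sort_named_array() {
    local -n _ref=$1        # _ref now aliases the caller's array
    local IFS=$'\n'         # join and split on newlines inside this function only
    _ref=( $(sort -n <<<"${_ref[*]}") )
}

# Example: sort one column array without eval.
array0=(9 1 6 3 3 6)
sort_named_array array0
printf '%s\n' "${array0[*]}"
```

The example prints `1 3 3 6 6 9`. Because `IFS` is declared local, the caller's field splitting is left untouched.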

      



Not bash, but I think this Python code is worth looking at to see how this task can be achieved with built-in functions.

From the interpreter:

$ cat matrix.txt 
1 1 1 1 1
9 3 4 5 5
6 7 8 9 7
3 6 8 9 1
3 4 2 1 4
6 4 4 7 7

$ python
Python 2.7.3 (default, Jun 19 2012, 17:11:17) 
[GCC 4.4.3] on hp-ux11
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> f = open('./matrix.txt')
>>> for row in zip(*[sorted(a, key=int)
...                  for a in zip(*[line.split() for line in f])]):
...     print ' '.join(row)
... 
1 1 1 1 1
3 3 2 1 1
3 4 4 5 4
6 4 4 7 5
6 6 8 9 7
9 7 8 9 7

      
