Calculating averages from matrices
I have a set of matrices stored in text files. I would like to compute the output matrix whose entries are the element-wise means of the input matrices. Below is an illustration:
cat file1.txt
Item0 Item1
Item0 1.01456e+06 5
Item1 2 12.2
cat file2.txt
Item0 Item1
Item0 1.0274e+06 6
Item1 0 14.5
cat output.txt
Item0 Item1
Item0 1020980 5.5
Item1 1 13.35
Note that some of the values in the input matrices are in engineering notation. All suggestions are welcome!
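awk -v row=2:3 -v col=2:3 -v num=2 '
BEGIN {
    split(row, r, ":")   # first:last data row (line 1 holds the column labels)
    split(col, c, ":")   # first:last data column (column 1 holds the row labels)
    n = num              # number of input matrices
}
# Accumulate each cell across files; FNR restarts at 1 for every file.
r[1] <= FNR && FNR <= r[2] {
    for (i = c[1]; i <= c[2]; i++) {
        m[FNR, i] += $i
    }
}
END {
    # Print the mean of each cell, one matrix row per line.
    for (i = r[1]; i <= r[2]; i++) {
        for (j = c[1]; j <= c[2]; j++) {
            printf("%f\t", m[i, j] / n)
        }
        print ""
    }
}' file{1,2}.txt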
1020980.000000 5.500000
1.000000 13.350000
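The output above drops the row and column labels that appear in the question's output.txt. A minimal sketch that keeps them might look like the following. The function name mean_with_labels is hypothetical, and the sketch assumes every file has the same shape, the same labels, a header on line 1, and row labels in column 1.

```shell
# Hypothetical helper: mean_with_labels FILE...
# Assumes all files share the same shape and the same labels.
mean_with_labels() {
    awk '
    FNR == 1 { header = $0; next }       # column labels (kept from the last file read)
    {
        label[FNR] = $1                  # row label sits in column 1
        for (i = 2; i <= NF; i++) sum[FNR, i] += $i
        if (FNR > rows) rows = FNR       # last data line number
        if (NF > cols) cols = NF         # last data column number
    }
    END {
        nfiles = ARGC - 1                # number of file operands passed to awk
        print header
        for (r = 2; r <= rows; r++) {
            line = label[r]
            for (i = 2; i <= cols; i++) line = line " " (sum[r, i] / nfiles)
            print line
        }
    }' "$@"
}
```

Calling it as mean_with_labels file1.txt file2.txt on the two sample files should reproduce the labelled matrix from the question. Means are printed with awk's default number formatting rather than %f.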
I would suggest doing this in two steps. First, transform each matrix into a list of (row number, column number, value) triples, one triple per line. For simplicity, I'll assume matrices without row and column labels.
for f in file*.txt
do
    awk '{ for (n=1; n<=NF; n++) { print NR, n, $n } }' "$f"
done
This first step brings all matrices together in a way that is easier to process.
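To make the triple format concrete, here is a small sketch of that conversion applied to one matrix. The wrapper name to_triples is hypothetical, and the sample input is assumed to be the values of file1.txt with the labels stripped.

```shell
# Hypothetical helper: print one "row column value" line per matrix cell.
# Reads the named file, or standard input when no file is given.
to_triples() {
    awk '{ for (n = 1; n <= NF; n++) print NR, n, $n }' "$@"
}

# Assumed sample: the values of file1.txt without row and column labels.
printf '1.01456e+06 5\n2 12.2\n' | to_triples
# prints:
# 1 1 1.01456e+06
# 1 2 5
# 2 1 2
# 2 2 12.2
```

Because every cell now carries its own coordinates, the triples from all files can be mixed into one stream without losing track of which cell a value belongs to.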
Then compute the averages by piping the triples into a second awk script, which sums the values per cell and divides by the number of matrices:
awk -v Rows=2 -v Cols=2 -v Mats=2 '
{
    sum[$1, $2] += $3
}
END {
    for (m=1; m<=Rows; m++) {
        for (n=1; n<=Cols; n++) {
            printf("%s ", sum[m, n] / Mats)
        }
        printf("\n")
    }
}'
For the sake of simplicity, I've just passed the numbers of rows, columns, and matrices as awk variables. Alternatively, you can derive them from the triples themselves.
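One way to derive those numbers from the triples is sketched below: track the largest row and column index seen, and count how many values arrive for each cell. The function name average_matrices is hypothetical, and the sketch assumes unlabeled, whitespace-separated matrices of identical shape.

```shell
# Hypothetical helper: average_matrices FILE... prints the element-wise mean.
# Assumes unlabeled, whitespace-separated matrices of identical shape.
average_matrices() {
    for f in "$@"; do
        # Step 1: emit one "row column value" triple per matrix cell.
        awk '{ for (n = 1; n <= NF; n++) print NR, n, $n }' "$f"
    done |
    awk '
    {
        # Step 2: accumulate per-cell sums and counts.
        sum[$1, $2] += $3
        count[$1, $2]++
        if ($1 + 0 > rows + 0) rows = $1   # largest row index seen
        if ($2 + 0 > cols + 0) cols = $2   # largest column index seen
    }
    END {
        for (m = 1; m <= rows; m++)
            for (n = 1; n <= cols; n++)
                printf("%s%s", sum[m, n] / count[m, n], (n < cols ? " " : "\n"))
    }'
}
```

Dividing by the per-cell count instead of a fixed matrix count means no dimensions have to be passed in by hand. For example, average_matrices file*.txt averages every matching file in the current directory.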