How to transpose in bash / awk / other

Question

How to transpose in bash / awk / other

How to transfer the following data

colA    colB    colC    colD
val1    val2    val3    val4
val5    val6    val7    val8

To make it look like this:

colA    val1    val5
colB    val2    val6
colC    val3    val7
colD    val4    val8

This data is delimited by tabs, but can also be separated by commas.

I can do this very easily in excel, but I am wondering how to do it in bash using awk or something?

EDIT1

I can make it work with spaces

$ cat testdata2_withspace.txt
colA colB colC colD
val1 val2 val3 val4
val5 val6 val7 val8

$ ./transpose3.sh testdata2_withspace.txt
colA val1 val5
colB val2 val6
colC val3 val7
colD val4 val8

but it doesn't do the same with comma delimited files

$ cat testdata2.txt
colA,colB,colC,colD
val1,val2,val3,val4
val5,val6,val7,val8

$ ./transpose3.sh testdata2.txt
colA,colB,colC,colD val1,val2,val3,val4 val5,val6,val7,val8

+3

bash awk transpose

HattrickNZ May 25 '15 at 21:01

source to share

3 answers

Ed morton · Answer 1 · 2015-05-25T21:33:47+0000

$ cat tst.awk
{ for (i=1; i<=NF; i++) cell[NR,i]=$i }
END {
    for (row=1; row<=NF; row++) {
        for (col=1; col<=NR; col++) {
            printf "%s%s", cell[col,row], (col<NR?OFS:ORS)
        }
    }
}

$ awk -f tst.awk file
colA val1 val5
colB val2 val6
colC val3 val7
colD val4 val8

To use commas instead of spaces, add BEGIN{ FS=OFS="," }

at the beginning as a separator .

The UNIX Man · Answer 2 · 2015-05-25T21:11:58+0000

There is an awk script here that might work:

#!/bin/bash 

transpose() 
{ 
  awk ' 
      BEGIN {
          FS = ",";
          OFS = ",";
      }

      { 
              if (max_nf<NF) 
                    max_nf=NF 
              max_nr=NR 
              for (x=1; x<=NF; ++x) 
                     vector[x, NR]=$x 
      } 

      END { 
              for (x=1; x<=max_nf; ++x) { 
                   for (y=1; y<=max_nr; ++y) 
                        printf("%s ", vector[x, y]) 
                   printf("\n") 
               } 
          }'  ${1} 
} 

transpose ${1}

gboffi · Answer 3 · 2015-05-25T21:39:02+0000

Since the title of the question contains "/ other" I like to post the Python + Bash answer.

The following is for a tab / space delimited file, minor adjustments needed for CSV

$ trans () { python -c 'print "\n".join("\t".join(t)for t in zip(*[l.split()for l in open("'"$1"'")]))' ; }
$ cat test2.csv
Roll_num Marks Grade
1        75    A
2        60    C
27       68    B
61       45    E
$ trans test2.csv 
Roll_num        1       2       27      61
Marks   75      60      68      45
Grade   A       C       B       E

The wrapper function is defined to be reliable with respect to the names "fun ny"

$ cp test2.csv fun\ ny
$ trans fun\ ny
Roll_num        1       2       27      61
Marks   75      60      68      45
Grade   A       C       B       E
$

Adding

Below are the minor adjustments needed for the CSV file

$ trans, () { python -c'print"\n".join(",".join(r)for r in zip(*[[i.strip()for i in l.strip().split(",")]for l in open("'"$1"'")]))'; }
$ cat test111.csv
Sales #, Date, Tel Number, Comment
393ED3, 5/12/2010, 5555551212, left message
585E54, 6/15/2014, 5555551213, voice mail
585868, 8/16/2010, , number is 5555551214
$ trans, test111.csv
Sales #,393ED3,585E54,585868
Date,5/12/2010,6/15/2014,8/16/2010
Tel Number,5555551212,5555551213,
Comment,left message,voice mail,number is 5555551214

For those who are not familiar with Python and do not know how it works,

$ python -c'string'

executes a Python statement 'string'

, which in our case (given that it fname

is an extension S1

)

 print "\n".join("\t".join(t)for t in zip(*[l.split()for l in open("fname")]))

Python strings have a method .join(iterable)

that acts on the iteration of other strings,

' r '.join(('a','b','c')) => 'a r b r c'

and here we have '\n

.join (...) that takes the elements in

... and joins them with a new line, but what we have in

... `?

We have a generator expression where for each iterable element t

returned by the inline zip

we attach the elements of the iterable using tabs

We can conclude that we are printing individual lines of tab-delimited elements, with those elements being elements t

, each line printed corresponding to an iterable t

returned zip

.

The next question is, what are the elements of each printed line? we must study

zip(*[l.split() for l in open("fname")])

internally, l

is a line in the original file, we get them iterated over the file object returned open("fname")

, we work on each line with a method .split()

that breaks into spaces by default, and returns a split list

elements

[l.split() for l in open("fname")]

is a so-called list comprehension, meaning that in this case the value of the expression is a list of lists, one list of separated elements for each line of the input file.

We are now left with the equivalent

lol = [l.split() for l in open("fname")]
zip(*lol)

zip

is a builtin that takes N

sequences as arguments and returns a sequence of sequences as a list of tuples, taking one element from each sequence in turn

zip((1,2,3),(4,5,6),(7,8,9)) => [(1,4,7),(2,5,8),(3,6,9)]

ultimately the prefix operator *

unpacks the list, so we can conclude that the elements of each printable line (elements t for t in zip(....)

) are the first elements from each line, the second elements from each line, etc. so we completed the task of printing the transposition of the input file.

How to transpose in bash / awk / other

Adding

More articles: