Bash: changing column fields by current number of input / run numbers

I'm stuck with a pretty simple task (which is even more frustrating ;-)): I have a column like this:

>foo111_bar37
>foo111_bar38
>foo111_bar40
>foo111_bar40
>foo111_bar41
>foo111_bar42
>foo111_bar49
>foo111_bar49
>foo111_bar49
...

      

and I would like to either change this column or add a new column that carries a running count of how many times the same value has appeared so far:

>foo111_bar37x1
>foo111_bar38x1
>foo111_bar40x1
>foo111_bar40x2
>foo111_bar41x1
>foo111_bar42x1
>foo111_bar49x1
>foo111_bar49x2
>foo111_bar49x3
...

      

The purpose is that each line becomes unique while still containing the original information. I learned how to access a column with awk and how to change lines in general (like always appending "x1"), but not how to make the change depend on a count. Most people seem to want to get rid of their duplicates or count the total number of duplicates, which doesn't help me here.

BTW: I am using the MobaXterm bash environment for Windows.

Thank you so much!



2 answers


With awk you can do it like this:

$ awk '{a[$1]++;print $1 "x" a[$1]}' file
>foo111_bar37x1
>foo111_bar38x1
>foo111_bar40x1
>foo111_bar40x2
>foo111_bar41x1
>foo111_bar42x1
>foo111_bar49x1
>foo111_bar49x2
>foo111_bar49x3

      



Explanation:

$ awk ' {
   a[$1]++             # store to hash a using first field as key. ++ increases
                       # its value by 1 on each iteration for each $1
   print $1 "x" a[$1]  # output $1, "x" and current value of a[$1]
}' file
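The question also mentions files with more than one column, and asks for either changing the column or getting a new one. A minimal sketch of both variants follows, assuming whitespace-separated fields with the ID in the first one. The two-column sample data and the function names `sample`, `number_in_place` and `number_as_column` are made up for illustration.

```shell
# Hypothetical two-column input; the second column stands in for any extra data.
sample() {
  printf '%s\n' '>foo111_bar40 a' '>foo111_bar40 b' '>foo111_bar41 c'
}

# Variant A: rewrite the first field in place, keeping the other fields.
number_in_place() {
  awk '{ n = ++seen[$1]; $1 = $1 "x" n; print }'
}

# Variant B: leave the line untouched and append the count as a new last column.
number_as_column() {
  awk '{ print $0, ++seen[$1] }'
}

sample | number_in_place
sample | number_as_column
```

Note that assigning to `$1` makes awk rebuild the line with the output field separator (a single space by default), so the original spacing between columns is not preserved in variant A.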

      



A slightly shorter solution that keeps the same concept as Sir James Brown's excellent answer:

awk '{print $0"x"++array[$0]}'  Input_file

      



Explanation: print outputs a line in awk. Here it prints the current line ($0), then the string "x", and then the value of an array called array, indexed by $0. ++array[$0] means that the value stored under that index is incremented first and then printed.

Say >foo111_bar40 has appeared once: it then has an entry in the array with the value 1. The next time the same line appears, awk finds that this index is already present in the array, so it just increments the value by 1 and prints the result.
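To see why the position of ++ matters, here is a small sketch comparing pre-increment and post-increment on a made-up three-line input. The function names and the array name `count` are chosen for illustration only; the parentheses are there just to make the grouping explicit.

```shell
# Pre-increment: the counter is bumped first, so the first occurrence gets 1.
pre_increment() {
  awk '{ print $0 "x" (++count[$0]) }'
}

# Post-increment: the old value is used first, so the first occurrence gets 0.
post_increment() {
  awk '{ print $0 "x" (count[$0]++) }'
}

printf '%s\n' a a b | pre_increment    # ax1, ax2, bx1
printf '%s\n' a a b | post_increment   # ax0, ax1, bx0
```

With post-increment the numbering would start at 0, which is why the answer puts ++ in front of the array reference.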
