Split row header

I want to reformat the lines below; see the example input and desired output. I have been experimenting with awk without finding the right solution.

Input:

>1-672762
TGAGGTAGTAGGTTGTATGGTT
>2-240457
TGAGGTAGTAGGTTGTGTGGTT
>3-130231
TAGCAGCACGTAAATATTGGCG
>4-116485
TGAGGTAGTAGGTTGTATAGTT

      

Output (the sequence and the count should be separated by a tab):

TGAGGTAGTAGGTTGTATGGTT  672762
TGAGGTAGTAGGTTGTGTGGTT  240457
TAGCAGCACGTAAATATTGGCG  130231
TGAGGTAGTAGGTTGTATAGTT  116485

      



4 answers


Using perl:

$ perl -lne '/^>\d+-(\d+)/ or print "$_\t$1"' file

      



Output:

TGAGGTAGTAGGTTGTATGGTT    672762
TGAGGTAGTAGGTTGTGTGGTT    240457
TAGCAGCACGTAAATATTGGCG    130231
TGAGGTAGTAGGTTGTATAGTT    116485

      



Another approach in perl (the -055 switch sets the input record separator to "-", which is chr(055) in octal):

perl -wln055e's/(\S+)\s+(\S+).*/$2\t$1/s and print'

      



or

perl -wlp055e'BEGIN{<>}s/(\S+)\s+(\S+).*/$2\t$1/s'
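
The same record-separator idea can be sketched in awk. This is an analog of the perl commands above, not a translation of them: with RS set to "-", every record after the first begins with the count and continues with the sequence that follows it. The function name `split_on_dash` and the shortened sample data are assumptions made for illustration.

```shell
# split_on_dash: hypothetical helper; reads FASTA-like text on stdin.
split_on_dash() {
    awk '
        BEGIN { RS = "-" }             # records end at each "-", not at newlines
        NR > 1 { print $2 "\t" $1 }    # $1 is the count, $2 the sequence after it
    '
}

printf '%s\n' '>1-672762' 'TGAGGTAGTAGGTTGTATGGTT' \
              '>2-240457' 'TGAGGTAGTAGGTTGTGTGGTT' | split_on_dash
```

The first record is just ">1", which is why it is skipped with `NR > 1`. Like the perl versions, this relies on "-" never appearing inside a sequence line.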

      



$ awk -F- '/>/{x=$2;next} {print $0 "\t" x}' file
TGAGGTAGTAGGTTGTATGGTT  672762
TGAGGTAGTAGGTTGTGTGGTT  240457
TAGCAGCACGTAAATATTGGCG  130231
TGAGGTAGTAGGTTGTATAGTT  116485
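
For readers new to awk, the one-liner above can be spelled out as a commented sketch. The function name `append_count` and the variable name `count` are illustrative, and the header test is anchored as `/^>/` here, a small assumed tightening of the original `/>/`.

```shell
# append_count: hypothetical helper; reads FASTA-like text on stdin.
append_count() {
    awk -F- '
        /^>/ { count = $2; next }   # header line: keep the part after "-", print nothing
        { print $0 "\t" count }     # sequence line: append the saved count after a tab
    '
}

printf '%s\n' '>1-672762' 'TGAGGTAGTAGGTTGTATGGTT' \
              '>2-240457' 'TGAGGTAGTAGGTTGTGTGGTT' | append_count
```

Because the count is stored until the next header arrives, a record whose sequence is wrapped over several lines would get the count appended to each of those lines.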

      



This might work for you (GNU sed):

sed -r 'N;s/^[^-]*-(.*)\n(.*)/\2\t\1/' file
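
The sed command can be read step by step in the sketch below. It assumes strictly alternating header and sequence lines, and it inserts a literal tab through a shell variable so that it does not depend on `\t` being understood in the replacement. The function name `pair_lines` is illustrative.

```shell
# pair_lines: hypothetical helper; reads FASTA-like text on stdin.
pair_lines() {
    tab=$(printf '\t')   # literal tab character for the replacement text
    # N appends the next input line to the pattern space, joined by a newline.
    # The substitution drops everything up to the first "-", then swaps the
    # count (group 1) and the sequence (group 2) around the tab.
    sed -E "N;s/^[^-]*-(.*)\n(.*)/\2${tab}\1/"
}

printf '%s\n' '>1-672762' 'TGAGGTAGTAGGTTGTATGGTT' \
              '>2-240457' 'TGAGGTAGTAGGTTGTGTGGTT' | pair_lines
```

Since each cycle consumes exactly two lines, a stray blank line or a sequence wrapped over several lines would shift the pairing; the awk approach is more forgiving in that case.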

      
