Using the tr/// operator to count letters in a string

I would like to count the number of A, C and G characters in a sequence or string. I wrote the following code.

But when I print the values, only the A count is correct; C and G are displayed as zero. In the code below I evaluate A first, but if I switch the order and evaluate C first, I get the C count, and now A and G are printed as zero.

Can anyone tell me what is wrong with my code? Thanks!

#! /usr/bin/perl

use strict;
use warnings;

open(IN, "200BP_junctions_fasta.faa") or die "Cannot open the file: $!\n";
while(<IN>) 
    next if $_ =~ /\>/;
    my $a = ($_ = tr/A//);
    my $c = ($_ = tr/C//);
    my $g = ($_ = tr/G//);
    print "A:$a, C:$c, G:$g\n";
}

      

The file looks like this:

> A_Seq  
ATGCTAGCTAGCTAGCTAGTC  
> B_Seq  
ATGCGATCGATCGATCGATAG  

      



4 answers


Change $_ = tr/ to $_ =~ tr/. Also, you are missing the opening curly brace for the while loop.
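Putting both fixes together, a minimal corrected sketch could look like the following. To stay self-contained it reads the two sample records from a string through an in-memory filehandle rather than from 200BP_junctions_fasta.faa. The count_acg helper and the variable names are hypothetical, introduced only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: count A, C and G in one sequence line.
# tr/X// with an empty replacement list counts matches and leaves the string unchanged.
sub count_acg {
    my ($seq) = @_;
    my $n_a = ($seq =~ tr/A//);   # =~ binds tr to $seq; a plain = would assign the count
    my $n_c = ($seq =~ tr/C//);
    my $n_g = ($seq =~ tr/G//);
    return ($n_a, $n_c, $n_g);
}

# Sample data from the question, read through an in-memory filehandle
my $fasta = "> A_Seq\nATGCTAGCTAGCTAGCTAGTC\n> B_Seq\nATGCGATCGATCGATCGATAG\n";
open(my $in, '<', \$fasta) or die "Cannot open the in-memory file: $!\n";

while (my $line = <$in>) {        # note the opening curly brace
    next if $line =~ />/;         # skip FASTA header lines
    chomp $line;
    my ($n_a, $n_c, $n_g) = count_acg($line);
    print "A:$n_a, C:$n_c, G:$n_g\n";
}
close $in;
```

For the two sample sequences this should print A:5, C:5, G:5 and then A:6, C:4, G:6.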





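Because '5' doesn't have a 'C' or a 'G'. You are assigning the result of the transliteration on $_ (the count) back to $_, so after the first statement $_ holds the count, 5, instead of the sequence. If you bind the operation to $_ instead ($_ =~ tr//), you get the desired result.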
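To see the overwrite happen, here is a small sketch that runs the assignment form against the first sample line and then compares it with the bound form. The variable names are made up for illustration.

```perl
use strict;
use warnings;

# Assignment form from the question: tr/A// runs on $_ and its count is stored back into $_
$_ = "ATGCTAGCTAGCTAGCTAGTC";
my $first  = ($_ = tr/A//);    # $_ is now the count of A's, not the sequence
my $after  = $_;               # remember what $_ holds at this point
my $second = ($_ = tr/C//);    # counts C's in that number, so the result is 0

# Bound form: =~ applies tr to $_ and leaves the string untouched
$_ = "ATGCTAGCTAGCTAGCTAGTC";
my $bound_a = ($_ =~ tr/A//);
my $bound_c = ($_ =~ tr/C//);

print "assignment: first=$first, \$_ afterwards='$after', second=$second\n";
print "binding:    A=$bound_a, C=$bound_c\n";
```

The first line of output shows $_ holding '5' after the first assignment, which is why every later count comes back as zero.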

But you don't really need to bind to the default variable at all. Binding is how you apply a regular expression or a transliteration to some other variable; without it, tr/// works on $_. You would do better to write:

my $a = tr/A//;
my $c = tr/C//;
my $g = tr/G//;

      



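But you can also do this, tallying the bases in the hash %_:

%_ = ();                            # reset the tallies for each line
$_{$_}++ foreach m/[ACG]/g;         # one increment per matched base
say "A:$_{A}, C:$_{C}, G:$_{G}";    # say requires: use feature 'say';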

      



The answer is that you need the binding operator, =~, instead of the assignment operator, =. Alternatively, since you are working on the default variable, you don't need to bind at all.

Lately I've been using printf for things like this:

while( <DATA> ) {
    next if /\>/;
    printf "A:%s C:%s G:%s\n", tr/A//, tr/C//, tr/G//;
    }

      

I've often wanted tr/// to be able to interpolate so that I could write this, which doesn't work:

while( my $line = <DATA> ) {
    next if $line =~ /\>/;
    print "Line is $line\n";
    printf "A:%s C:%s G:%s\n", map { $line =~ tr/$_// } qw(A C G);
    }

      

Note that $_ would be a needless annoyance if I used the default variable in the while loop, because map sets $_ too. I know I can do it with eval, but that is not only more of a hassle, it's also lame:

while( my $line = <DATA> ) {
    next if $line =~ /\>/;
    print "Line is $line\n";
    printf "A:%s C:%s G:%s\n", map { eval "\$line =~ tr/$_//" } qw(A C G);
    }

      

I don't need to see the implementation details, so I could move this into a subroutine until I figure out how to get rid of the eval, although the extra subroutine calls can slow things down when processing a lot of data:

while( my $line = <DATA> ) {
    next if $line =~ /\>/;
    print "Line is $line\n";
    printf "A:%s C:%s G:%s\n", map { count_bases( $line, $_ ) } qw(A C G);
    }

sub count_bases { eval "\$_[0] =~ tr/$_[1]//" }

      

There is probably some clever way to do this by XORing strings if you don't like tr///, but I've never chased it long enough to figure it out (not that it would be better than what you're already doing).
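One way to drop the eval, sketched here as an assumption rather than as the approach this answer had in mind, is to replace tr/// with a global regex match. A regex does interpolate variables, so the base can be passed in at run time. This keeps the count_bases name from above but changes its body, and it is likely slower than tr/// on large inputs.

```perl
use strict;
use warnings;

# Eval-free variant of count_bases (a swapped-in technique: regex match, not tr///).
sub count_bases {
    my ($seq, $base) = @_;
    # \Q...\E quotes any regex metacharacters in $base. Assigning the match to
    # an empty list forces list context; that assignment, evaluated in scalar
    # context, yields the number of matches.
    my $count = () = $seq =~ /\Q$base\E/g;
    return $count;
}

my $line = "ATGCGATCGATCGATCGATAG";
printf "A:%s C:%s G:%s\n", map { count_bases($line, $_) } qw(A C G);
```

For the second sample sequence this should print A:6 C:4 G:6.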



open(IN, "input") or die "Cannot open the file: $!\n";
while(<IN>) {
  next if $_ =~ /\>/;
  my $a = @{[m/(A)/g]};
  my $c = @{[m/(C)/g]};
  my $g = @{[m/(G)/g]};
  print "A:$a, C:$c, G:$g\n";
}

      
