Why doesn't my Perl word frequency example work?

I am very new to Perl and I am trying to write a word frequency counter as a learning exercise.

However, I cannot figure out the error in the code below, even after working on it. This is my code:

$wa = "A word frequency counter."; 
@wordArray = split("",$wa);
$num = length($wa);
$word = "";
$flag = 1; # 0 if previous character was an alphabet and 1 if it was a blank.
%wordCount = ("null" => 0);
if ($num == -1) {
    print "There are no words.\n";
} else {
    print "$length";
    for $i (0 .. $num) {
        if(($wordArray[$i]!=' ') && ($flag==1)) { # start of a new word.
            print "here";
            $word = $wordArray[$i];
            $flag = 0;
        } elsif ($wordArray[$i]!=' ' && $flag==0) { # continuation of a word.
            $word = $word . $wordArray[$i];
        } elsif ($wordArray[$i]==' '&& $flag==0) { # end of a word.
            $word = $word . $wordArray[$i];
            $flag = 1;
            $wordCount{$word}++;
            print "\nword: $word";
        } elsif ($wordArray[$i]==" " && $flag==1) { # series of blanks.
            # do nothing.
        }
    }
    for $i (keys %wordCount) {
        print " \nword: $i - count: $wordCount{$i} ";
    }
}

      

It prints neither "here" nor the words. At the moment I am not worried about optimization, although any input in that direction would be appreciated as well.



3 answers


Instead of

$wordArray[$i]!=' '

      

it should be

$wordArray[$i] ne ' '

      

according to the Perl documentation for string and character comparison. In general, use the numeric operators (==, >=, …) for numbers and the string operators (eq, ne, lt, …) for text.
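To see why the numeric operator silently fails here, the following minimal sketch compares the same two characters both ways. The helper names differs_numerically and differs_as_string are made up for illustration. With !=, Perl converts both strings to numbers first, and since neither 'A' nor ' ' looks like a number, both become 0 and compare as equal.

```perl
use strict;
use warnings;

# Numeric comparison: both operands are converted to numbers first.
# Neither 'A' nor ' ' looks like a number, so both become 0.
sub differs_numerically {
    my ($left, $right) = @_;
    no warnings 'numeric';    # silence "isn't numeric" for this demonstration only
    return $left != $right ? 1 : 0;
}

# String comparison: the characters themselves are compared.
sub differs_as_string {
    my ($left, $right) = @_;
    return $left ne $right ? 1 : 0;
}

print "!= says different: ", differs_numerically('A', ' '), "\n";   # prints 0
print "ne says different: ", differs_as_string('A', ' '), "\n";     # prints 1
```

This is why the "start of a new word" branch in the question is never taken: every `!=` test against ' ' is false, so "here" is never printed.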



Alternatively, you could do

@wordArray = split(" ",$wa);

      

instead of

@wordArray = split("",$wa);

      

and then you wouldn't have to do the awkward character checking, and you would never have had the problem. @wordArray will already be split into words, and you just need to count the occurrences.
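As a rough sketch of that approach (the sub name count_by_whitespace is invented here, not part of the original code), splitting on " " and counting with a hash could look like this. Note that splitting on whitespace leaves punctuation attached, so "counter." is counted with its trailing period.

```perl
use strict;
use warnings;

# Count whitespace-separated words in a string.
# Returns a reference to a hash mapping each word to its count.
sub count_by_whitespace {
    my ($text) = @_;
    my %count;
    # split ' ' splits on runs of whitespace and ignores leading whitespace
    $count{$_}++ for split ' ', $text;
    return \%count;
}

my $counts = count_by_whitespace("A word frequency counter.");
for my $word (sort keys %$counts) {
    print "word: $word - count: $counts->{$word}\n";
}
```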



This is a good example of a problem where Perl will help you figure out what's wrong if you just ask it for help. Get in the habit of always adding the lines:

use strict;
use warnings;

      



at the top of your Perl programs.
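To illustrate what warnings buys you for this particular bug, here is a small sketch that runs the question's comparison and collects whatever Perl warns about. The helper warnings_from is a made-up name for this example; it only uses the standard $SIG{__WARN__} hook.

```perl
use strict;
use warnings;

# Run a code reference and return the warnings it produced,
# instead of letting them go to STDERR.
sub warnings_from {
    my ($code) = @_;
    my @caught;
    local $SIG{__WARN__} = sub { push @caught, @_ };
    $code->();
    return @caught;
}

my @messages = warnings_from(sub {
    my $char = 'A';
    my $is_not_blank = ($char != ' ');    # the comparison from the question
});

# Each message says that an argument "isn't numeric" in a numeric comparison.
print "Perl warned: $_" for @messages;
```

In a normal script you would not capture the warnings at all; they simply appear on STDERR and point at the offending line. With strict enabled, the undeclared $length in the question's print statement would also be reported at compile time.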



You seem to be writing C in Perl. The difference is not only one of style: by exploding a string into an array of individual characters, you can also blow up the memory footprint of your script.

Also, you need to think about what constitutes a word. Below, I am not suggesting that \w+ is the definition of a word, but rather pointing out the difference between \S+ and \w+.

#!/usr/bin/env perl

use strict; use warnings;
use YAML;

my $src = '$wa = "A word frequency counter.";';

print Dump count_words(\$src, 'w');
print Dump count_words(\$src, 'S');

sub count_words {
    my $src = shift;
    my $class = sprintf '\%s+', shift;
    my %counts;

    while ($$src =~ /(?<sequence> $class)/gx) {
        $counts{ $+{sequence} } += 1;
    }

    return \%counts;
}

      

Output:

---
A: 1
counter: 1
frequency: 1
wa: 1
word: 1
---
'"A': 1
$wa: 1
=: 1
counter.";: 1
frequency: 1
word: 1