How to make freq counter more effective?

I wrote this F # code to count the frequencies of words in a list and return a tuple in C #. Could you please tell me how to make the code more efficient or shorter?

let rec internal countword2 (tail : string list) wrd ((last : string list), count) =
match tail with
| [] -> last, wrd, count
| h::t -> countword2 t wrd (if h = wrd then last, count+1 else last @ [h], count)

let internal countword1 (str : string list) wrd =
let temp, wrd, count = countword2 str wrd ([], 0) in
temp, wrd, count

let rec public countword (str : string list) =
match str with
| [] -> []
| h::_ ->
  let temp, wrd, count = countword1 str h in
       [(wrd, count)] @ countword temp



source to share

3 answers

If you want to count the frequencies of words in a string list, your approach seems to be overkill. Seq.groupBy

works well for this purpose:

let public countWords (words: string list) = 
   words |> Seq.groupBy id
         |> (fun (word, sq) -> word, Seq.length sq)
         |> Seq.toList




Even the pad version can be more efficient and concise:

let countWords = Seq.countBy id



countWords ["a"; "a"; "b"; "c"] //returns: seq [("a", 2); ("b", 1); ("c", 1)]




Your solution is repeated through the input list several times, for each new word it encounters. Instead, you can iterate over the list just once and create a dictionary that contains the count of all occurrences for each word.

To do this, in functional style, you can use F # Map

, which is an immutable dictionary:

let countWords words = 
  // Increment the number of occurrences of 'word' in the map 'counts'
  // If it isn't already in the dictionary, add it with count 1
  let increment counts word =
    match Map.tryFind word counts with
    | Some count -> Map.add word (count + 1) counts
    | _ -> Map.add word 1 counts

  // Start with an empty map and call 'increment' 
  // to add all words to the dictionary
  words |> List.fold increment Map.empty


You can also implement the same thing in an imperative style that is more efficient but less elegant (and you won't get all the benefits of a functional style). However, the standard mutable Dictionary

can be used well from F # too (it will be similar to the C # version, so I won't write it here).

Finally, if you want a simple solution using only standard F # functions, you can use Seq.groupBy

as suggested by the pad. It will probably be almost as efficient as the .based version Dictionary

. But then, if you're just learning F #, then writing a few recursive functions like countWords

this is a great way to learn!

To give you some comments about your code, the complexity of your approach is slightly higher, but that should probably be fine. However, there are some common problems:

  • In your function countword2

    , you have if h = wrd then ... else last @ [h], count

    . The call is last @ [h]

    inefficient as it has to clone the entire list last

    . Instead, you can just write h::last

    to add the word to the beginning, because the order doesn't matter.

  • On the last line, you use again @

    in [(wrd, count)] @ countword temp

    . It's not obligatory. If you add one element to the top of the list, you should use: (wrd,count)::(countword temp)




All Articles