How to make freq counter more effective?
I wrote this F # code to count the frequencies of words in a list and return a tuple in C #. Could you please tell me how to make the code more efficient or shorter?
let rec internal countword2 (tail : string list) wrd ((last : string list), count) = match tail with |  -> last, wrd, count | h::t -> countword2 t wrd (if h = wrd then last, count+1 else last @ [h], count) let internal countword1 (str : string list) wrd = let temp, wrd, count = countword2 str wrd (, 0) in temp, wrd, count let rec public countword (str : string list) = match str with |  ->  | h::_ -> let temp, wrd, count = countword1 str h in [(wrd, count)] @ countword temp
source to share
Your solution is repeated through the input list several times, for each new word it encounters. Instead, you can iterate over the list just once and create a dictionary that contains the count of all occurrences for each word.
To do this, in functional style, you can use F #
, which is an immutable dictionary:
let countWords words = // Increment the number of occurrences of 'word' in the map 'counts' // If it isn't already in the dictionary, add it with count 1 let increment counts word = match Map.tryFind word counts with | Some count -> Map.add word (count + 1) counts | _ -> Map.add word 1 counts // Start with an empty map and call 'increment' // to add all words to the dictionary words |> List.fold increment Map.empty
You can also implement the same thing in an imperative style that is more efficient but less elegant (and you won't get all the benefits of a functional style). However, the standard mutable
can be used well from F # too (it will be similar to the C # version, so I won't write it here).
Finally, if you want a simple solution using only standard F # functions, you can use
as suggested by the pad. It will probably be almost as efficient as the .based version
. But then, if you're just learning F #, then writing a few recursive functions like
this is a great way to learn!
To give you some comments about your code, the complexity of your approach is slightly higher, but that should probably be fine. However, there are some common problems:
In your function
, you have
if h = wrd then ... else last @ [h], count
. The call is
last @ [h]
inefficient as it has to clone the entire list
. Instead, you can just write
to add the word to the beginning, because the order doesn't matter.
On the last line, you use again
[(wrd, count)] @ countword temp
. It's not obligatory. If you add one element to the top of the list, you should use:
source to share