The easiest way in R to get a vector of element frequencies in a vector

Question

Suppose I have a vector v of values. What is the easiest way to get a vector f of the same length as v, where the i-th element of f is the frequency of the i-th element of v in v?

The only way I know seems unnecessarily complicated:

v = sample(1:10,100,replace=TRUE)
D = data.frame( idx=1:length(v), v=v )
E = merge( D, data.frame(table(v)) )
E = E[ with(E,order(idx)), ]
f = E$Freq

Surely there is an easier way to do this, something along the lines of "frequency(v)"?

r

baixiwei May 20 '15 at 14:12

3 answers

whuber · Answer 1 · 2015-05-20T15:20:29+0000

For a vector v of small natural numbers, as in the question, the expression

tabulate(v)[v]

is particularly simple as well as fast.
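To make the indexing trick concrete, here is a minimal sketch on a made-up vector (the values are chosen only for illustration). It relies on v containing only positive whole numbers, since tabulate() counts occurrences of 1, 2, ..., max(v).

```r
# Small illustrative vector of positive integers (values made up for this example).
v <- c(3, 1, 3, 2, 3, 1)

# tabulate(v) counts how often each of 1, 2, ..., max(v) occurs:
# here 1 occurs twice, 2 once, 3 three times.
counts <- tabulate(v)

# Indexing the counts by v looks up, for each element, the count of its own value.
f <- counts[v]
print(f)  # 3 2 3 1 3 2
```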

For more general numeric vectors v, you can persuade ecdf to help you, as in

w <- sapply(v, ecdf(v)) * length(v)
tabulate(w)[w]
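As a rough illustration of why this works, the sketch below applies the same idea to a small made-up vector of non-integer values. Two things here are my own assumptions rather than part of the answer: the function returned by ecdf() is called on the whole vector at once instead of through sapply(), and round() is added as a guard so that a product such as (k/n) * n that comes out a hair below k is not truncated to k - 1.

```r
# Illustrative data (made up): non-integer and negative values are fine here.
v <- c(0.5, 2.25, 0.5, -1, 2.25, 0.5)

# ecdf(v) returns a step function. Evaluated at v and scaled by length(v),
# it gives, for each element, how many elements are less than or equal to it.
# Equal values therefore share the same whole-number label.
w <- round(ecdf(v)(v) * length(v))

# Same trick as before: count the labels, then look each label's count up.
f <- tabulate(w)[w]
print(f)  # 3 2 3 1 2 3
```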

It is probably best to code the underlying algorithm yourself, which also avoids the floating-point rounding error implicit in the ecdf approach:

frequencies <- function(x) {
  i <- order(x)
  v <- x[i]
  w <- cumsum(c(TRUE, v[-1] != v[-length(x)]))
  f <- tabulate(w)[w]
  return(f[order(i)])
}

This algorithm sorts the data, assigns sequential identifiers 1, 2, 3, ... to the values as it encounters them (by cumulatively summing a binary indicator of where the values change), uses the previous tabulate()[] trick to obtain the frequencies efficiently, and then reorders the results so that the output matches the input, component by component.
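The steps described above can be traced on a tiny made-up input. The sketch below repeats the body of frequencies() one line at a time, with the intermediate values shown in comments; the variable names changed and result are mine, added only for readability.

```r
# Illustrative input (made up); any sortable vector would do.
x <- c(20, 10, 30, 10, 20, 10)

i <- order(x)            # 2 4 6 1 5 3   (positions of x in sorted order)
v <- x[i]                # 10 10 10 20 20 30

# TRUE wherever a new value starts in the sorted data.
changed <- c(TRUE, v[-1] != v[-length(v)])   # TRUE FALSE FALSE TRUE FALSE TRUE

w <- cumsum(changed)     # 1 1 1 2 2 3   (sequential ID per distinct value)
f <- tabulate(w)[w]      # 3 3 3 2 2 1   (frequencies, still in sorted order)

# order(i) is the inverse permutation of i, so this restores the input order.
result <- f[order(i)]
print(result)            # 2 3 1 3 2 3
```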

Vincent Guillemot · Answer 2 · 2015-05-20T14:25:55+0000

Something like this works for me:

sapply(v, function(elmt, vec) sum(vec == elmt), vec=v)

bgoldst · Answer 3 · 2015-05-20T16:45:55+0000

I think the best solution here is:

ave(v,v,FUN=length)

The purpose of ave() is simply to replicate the return value of FUN() and map it back to each index of the input vector whose element was part of the group for which that particular call to FUN() was made.
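A short sketch of that mapping on made-up data follows. The second half is an assumption on my part rather than something stated in the answer: as far as I understand, ave() writes the group results back into its first argument, so for a character vector it may be safer to pass seq_along(s) as the first argument to keep the counts numeric.

```r
# Illustrative numeric data (made up).
v <- c(7, 4, 7, 7, 9, 4)

# Each element is grouped with its equal values; length() of the group
# is written back to every position belonging to that group.
f <- ave(v, v, FUN = length)
print(f)  # 3 2 3 3 1 2

# Assumed variant for a character vector s: group by s, but let ave()
# fill an integer vector so the counts are not coerced to strings.
s <- c("x", "y", "x")
g <- ave(seq_along(s), s, FUN = length)
print(g)  # 2 1 2
```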
