Preserve order in split-apply-combine tasks
Possible duplicate: How to ddply() without sorting?
I have the following data frame:
dd1 = data.frame(cond = c("D","A","C","B","A","B","D","C"), val = c(11,7,9,4,3,0,5,2))
dd1
cond val
1 D 11
2 A 7
3 C 9
4 B 4
5 A 3
6 B 0
7 D 5
8 C 2
I now need to calculate the cumulative sum of val within each level of the factor cond. The result should look like this:
> dd2 = data.frame(cond = c("D","A","C","B","A","B","D","C"), val = c(11,7,9,4,3,0,5,2), cumsum=c(11,7,9,4,10,4,16,11))
> dd2
cond val cumsum
1 D 11 11
2 A 7 7
3 C 9 9
4 B 4 4
5 A 3 10
6 B 0 4
7 D 5 16
8 C 2 11
It is important to get the result data frame in the same order as the original data frame because other variables are bound to it.
I tried ddply(dd1, .(cond), summarize, cumsum = cumsum(val)), but it didn't give the expected result.
Thanks.
If doing it by hand is an option, then split() and unsplit(), with a suitable lapply() in between, will do it for you.
dds <- split(dd1, dd1$cond)
dds <- lapply(dds, function(x) transform(x, cumsum = cumsum(x$val)))
unsplit(dds, dd1$cond)
The last line gives
> unsplit(dds, dd1$cond)
cond val cumsum
1 D 11 11
2 A 7 7
3 C 9 9
4 B 4 4
5 A 3 10
6 B 0 4
7 D 5 16
8 C 2 11
I have separated the three steps, but they could be combined or wrapped in a function if you do this a lot.
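As a rough sketch of what such a function might look like: the helper name cumsum_by_group and its arguments are made up for illustration and are not part of the answer above. It takes the data frame plus the names of the grouping and value columns, and runs the same split(), lapply() and unsplit() steps.

```r
# Hypothetical helper: the name and arguments are illustrative assumptions.
# It runs the same split / lapply / unsplit steps, so rows return in their
# original order.
cumsum_by_group <- function(df, group, value, out = "cumsum") {
  pieces <- split(df, df[[group]])            # one data frame per group level
  pieces <- lapply(pieces, function(x) {
    x[[out]] <- cumsum(x[[value]])            # running total within the group
    x
  })
  unsplit(pieces, df[[group]])                # reassemble in the original row order
}

dd1 <- data.frame(cond = c("D", "A", "C", "B", "A", "B", "D", "C"),
                  val  = c(11, 7, 9, 4, 3, 0, 5, 2))
dd2 <- cumsum_by_group(dd1, "cond", "val")

# Cross-check against base R's ave(), which also applies a function per
# group while keeping the original order.
check <- ave(dd1$val, dd1$cond, FUN = cumsum)
```

If only the new column is needed, the ave() call on its own is probably the shorter route. The split()/unsplit() version is more convenient when several columns have to be computed per group.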