Loop with dplyr on each line of the data block
I have a dataframe
df<-data.frame(var1=c(10,20,30,40,50),var2=c(rep(0.3,5)),BYGROUP_OBSNUM=c(0:4))
var1 var2 BYGROUP_OBSNUM
10 0.3 0
20 0.3 1
30 0.3 2
40 0.3 3
50 0.3 4
I need to do the calculations for each row using dplyr, as my real framework is really huge and dplyr is very efficient.
What I want -
var1 var2 BYGROUP_OBSNUM VAR1_NEW
10 0.3 0 10
20 0.3 1 23
30 0.3 2 36.9
40 0.3 3 51.07
50 0.3 4 65.321
How is this achieved -
var1 var2 BYGROUP_OBSNUM VAR1_NEW
10 0.3 0 10
20 0.3 1 20+10*0.3
30 0.3 2 30+20*0.3+10*0.3^2
40 0.3 3 40+30*0.3+20*0.3^2+10*0.3^3
50 0.3 4 50+40*0.3+30*0.3^2+20*0.3^3+10*0.3^4
Therefore, for each line, the formula is -
var1[i]+lag(var1,1)*var2^1+lag(var1,2)*var2^2 +....
until lag (var1) reaches the line where BYGROUP_OBSNUM is 0
What I have achieved so far -
df1<-df %>%
mutate(var3=ifelse ((!(var2==0) | (!(BYGROUP_OBSNUM==0))), var2, 0)) %>%
rowwise()%>%
ungroup() %>%
mutate(var1_new=var1+lag(var1,1)*var2)
I need to change the last line so that the formula lags from lag (var1,1) to lag (var1, BYGROUP_OBSNUM) for each row, and the cardinality of var2 also increases from 1 to BYGROUP_OBSNUM. How to do it?
source to share
Here is a custom function that can be used with dplyr to get the results you are after. It also works with the function group_by
.
my.func <- function(x){mapply(function(v1,v2,n) {
if(n == 1){
as.numeric(v1[n])
} else{
sum(v1, x[rev(seq(1:(n-1))),1] * v2 ^ seq(1:(n-1)))
}
}, x[,"var1"], x[,"var2"], seq(1:nrow(x)))
}
df <- df %>%
# group_by(COLUMNS, TO, GROUP, BY) %>%
do(data.frame(., my.func(.))) %>%
select(var1, var2, BYGROUP_OBSNUM, VAR1_NEW = my.func...)
source to share
The final decision is made -
df<-data.frame(var1=c(1:10),var2=c(rep(c(0,0.1),each=5)),BYGββROUP_OBSNUM=c(0:4))
my.func <- function(x){mapply(function(v1,v2,v3,n) {
if(v2==0 | v3==0){ as.numeric(v1) }
else {
sum(v1, x[rev(seq(1:(n-1))),1][1:v3] * v2 ^ seq(1:(n-1))[1:v3]) } },
x[,"var1"], x[,"var2"], x[,"BYGROUP_OBSNUM"],seq(1:nrow(x)))
}
df1 <- df %>%
do(data.frame(., my.func(.))) %>%
mutate(VAR1_NEW = my.func...)%>%
select(-my.func...)
executed 100 thousand lines in 1.42 minutes! This feature helped a lot! Thank!
source to share