Replacing a for-loop with apply to improve performance (with weighted.mean)
I'm an R newbie, so hopefully this is a solvable problem for some of you. I have a data frame containing over a million data points. My goal is to compute a weighted average with a varying starting point.
To illustrate, consider this data frame, created with A <- data.frame(matrix(c(1,2,3,2,2,1), 3, 2)):
X1 X2
1 1 2
2 2 2
3 3 1
where X1 is data and X2 is sample weight.
I want to calculate the weighted average of X1 for each starting point, i.e. over rows 1:3, 2:3, and 3:3.
With a loop, I just wrote:
B <- rep(NA,3) #empty result vector
for(i in 1:3){
B[i] <- weighted.mean(x=A$X1[i:3],w=A$X2[i:3]) #shifting the starting point of the data and weights further to the end
}
With my real data this is not feasible: every iteration subsets the data frame again, and the computation runs for hours without producing a result.
Is there a way to implement the varying starting point with one of the apply functions to improve performance?
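For reference, here is a minimal sketch of one way to avoid the repeated subsetting. It is not from the original post. It assumes there are no NA values and that the weights in every tail sum to a nonzero value. Each result is a ratio of two suffix sums, which can be obtained from reversed cumulative sums instead of a loop or apply. The names num, den, B_fast and B_loop are made up for illustration.

```r
# Example data from the question (same values and column names as A)
A <- data.frame(X1 = c(1, 2, 3), X2 = c(2, 2, 1))

# Suffix sums via reversed cumulative sums:
#   num[i] = sum(X1[i:n] * X2[i:n])
#   den[i] = sum(X2[i:n])
num <- rev(cumsum(rev(A$X1 * A$X2)))
den <- rev(cumsum(rev(A$X2)))

# Weighted mean of rows i:n for every starting point i
B_fast <- num / den

# Cross-check against the loop-style computation on the small example
n <- nrow(A)
B_loop <- vapply(
  seq_len(n),
  function(i) weighted.mean(A$X1[i:n], A$X2[i:n]),
  numeric(1)
)
print(all.equal(B_fast, B_loop))
```

This makes a single pass over the data, so the cost should grow linearly with the number of rows rather than quadratically as with re-subsetting i:n on every iteration.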
Best regards, Ruben
2 answers