How do I create a new variable associated with the adjacent ID position in R?
This is what my data looks like:
ID XYZ N_ID1 N_ID2
1 10 3 4
2 20 8 2
3 50 6 5
4 100 2 6
5 70 7 10
6 25 1 3
7 30 2 4
8 35 6 9
. . . .
. . . .
. . . .
I have two variables, "N_ID1" and "N_ID2", which hold the IDs of the two neighbors of each "ID".
I want to create a new variable based on "XYZ", "N_ID1", and "N_ID2", so that the new variable is the mean of "XYZ" over the rows whose ID equals "N_ID1" and "N_ID2".
So, if we look at the first line, where ID = 1, we have "N_ID1" = 3 and "N_ID2" = 4. My new variable should then be the average of the "XYZ" value at ID = 3 and the "XYZ" value at ID = 4, and similarly for the other lines.
This is what my final result should look like:
ID XYZ N_ID1 N_ID2 New_Variable
1 10 3 4 (50+100)/2 = 75
2 20 8 2 (35+20)/2 = 27.5
3 50 6 5 (25+70)/2 = 47.5
4 100 2 6 .
5 70 7 10 .
6 25 1 3 .
7 30 2 4 .
8 35 6 9 .
. . . . .
. . . . .
. . . . .
As you can see above, the first value of "New_Variable" is 75, which is the average of "XYZ" at ID 3 and ID 4.
Can anyone please tell me how to do this in R?
Use match to look up each N_IDx in ID, subset XYZ by the result, add the pieces with +, and divide by 2.
Reduce(`+`,
lapply(dat[c("N_ID1","N_ID2")], function(x) dat$XYZ[match(x,dat$ID)] )
) / 2
#[1] 75.0 27.5 47.5 22.5 NA 30.0 60.0 NA
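To reproduce the output above, here is a minimal sketch that builds dat from the eight rows visible in the question (the rows elided with dots are an unknown and are left out, which is an assumption) and shows what match returns for one neighbor column. IDs 10 and 9 do not appear among those eight rows, so their lookups come back as NA, which is where the NA values in the result come from. The names pos2 and xyz2 are only illustrative.

```r
# Assumption: dat contains only the eight rows shown in the question.
dat <- data.frame(
  ID    = 1:8,
  XYZ   = c(10, 20, 50, 100, 70, 25, 30, 35),
  N_ID1 = c(3, 8, 6, 2, 7, 1, 2, 6),
  N_ID2 = c(4, 2, 5, 6, 10, 3, 4, 9)
)

# match() gives, for each neighbor ID, the row position of that ID in dat$ID
# (NA when the ID is absent, as for 10 and 9 here).
pos2 <- match(dat$N_ID2, dat$ID)

# Indexing XYZ by those positions looks up each neighbor's XYZ value.
xyz2 <- dat$XYZ[pos2]

# Same computation as in the answer, stored as a new column.
dat$New_Variable <- Reduce(`+`,
  lapply(dat[c("N_ID1", "N_ID2")], function(x) dat$XYZ[match(x, dat$ID)])
) / 2

print(dat)
```

Because NA propagates through +, any row with a neighbor ID that is missing from ID ends up with NA in New_Variable.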
Without the functional approach, it would simply be:
with(dat, (XYZ[match(N_ID1, ID)] + XYZ[match(N_ID2, ID)]) / 2 )
But it gets painful if you have a lot of variables to sum up.
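If there are many neighbor columns, one possible variation (a sketch, not part of the original answer) is to collect the looked-up XYZ values into a matrix and let rowMeans do the averaging, so the divisor no longer has to be typed by hand. It assumes that every neighbor column name starts with "N_ID" and that dat holds only the eight rows shown in the question. The names nbr_cols, nbr_xyz and New_Variable_avail are made up for illustration.

```r
# Assumption: dat contains only the eight rows shown in the question.
dat <- data.frame(
  ID    = 1:8,
  XYZ   = c(10, 20, 50, 100, 70, 25, 30, 35),
  N_ID1 = c(3, 8, 6, 2, 7, 1, 2, 6),
  N_ID2 = c(4, 2, 5, 6, 10, 3, 4, 9)
)

# Assumption: all neighbor columns share the "N_ID" prefix.
nbr_cols <- grep("^N_ID", names(dat), value = TRUE)

# One matrix column per neighbor column, holding that neighbor's XYZ value.
nbr_xyz <- sapply(dat[nbr_cols], function(x) dat$XYZ[match(x, dat$ID)])

# Mean across neighbors; NA if any neighbor ID is missing from dat$ID.
dat$New_Variable <- rowMeans(nbr_xyz)

# Optional: average only the neighbors that were found.
dat$New_Variable_avail <- rowMeans(nbr_xyz, na.rm = TRUE)

print(dat)
```

Whether a missing neighbor should give NA or be skipped with na.rm = TRUE depends on what the average is meant to represent; the original answer's version corresponds to the first.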