Selecting the 5 smallest values for each row of a data frame in R

Question

Let's say I have a dataframe:

df=data.frame(var1=c(1,3,5,7),var2=c(4,6,8,10),var3=c(11,12,13,14))
df

  var1 var2 var3
1    1    4   11
2    3    6   12
3    5    8   13
4    7   10   14

Now I calculate the distance between each row and every other row, using var1 and var2:

library(fields)
df_dist=rdist(df[,1:2])
df_dist
         1        2        3        4
1 0.000000 2.828427 5.656854 8.485281
2 2.828427 0.000000 2.828427 5.656854
3 5.656854 2.828427 0.000000 2.828427
4 8.485281 5.656854 2.828427 0.000000
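
For readers who do not have the fields package installed, the same Euclidean distance matrix can be sketched with base R's dist(). This is only an illustrative alternative, not the code used in the question; the name df_dist_base is made up for this example.

```r
# Same toy data as in the question
df <- data.frame(var1 = c(1, 3, 5, 7),
                 var2 = c(4, 6, 8, 10),
                 var3 = c(11, 12, 13, 14))

# dist() computes pairwise Euclidean distances between rows by default and
# returns a compact "dist" object; as.matrix() expands it to the full
# square matrix with a zero diagonal.
# df_dist_base is a hypothetical name used only in this sketch.
df_dist_base <- as.matrix(dist(df[, 1:2]))

round(df_dist_base, 6)
```

The resulting matrix is symmetric, so row i and column i carry the same distances.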

Now my goal is to select the two column names from each row that have the lowest values in that row (excluding 0, i.e. distance from itself), so for row 1 the output should be colname = 2 and 3, similarly for row 2 the output should be 1 and 3, etc.

I can do this with a for loop, but it takes a long time on a large dataset. Is there a better way using apply, lapply, etc. that might save some time?

The loop code looks like this:

d=as.data.frame(df_dist)
#Setting the column and row names as var3 values
colnames(d)<-df$var3
rownames(d)<-df$var3

#Initializing variable e
e<-NULL

for (i in 1:nrow(d))
{
  #unlist() turns the one-row data frame d[i,] into a plain numeric vector for order()
  tmp=colnames(d)[order(unlist(d[i,]), decreasing=FALSE)][2:3]
  e<-rbind(e,tmp)
}

f=as.data.frame(e)

rownames(f)<-df$var3

bakas 18 jul. 17 at 7:58

1 answer

Florian · Accepted Answer · 2017-07-18T08:09:38+0000

This works:

df = read.table(text="1        2        3        4
1 0.000000 2.828427 5.656854 8.485281
2 2.828427 0.000000 2.828427 5.656854
3 5.656854 2.828427 0.000000 2.828427
4 8.485281 5.656854 2.828427 0.000000")

t(apply(df,1,function(x) colnames(df)[order(x)[2:3]]))

OUTPUT:

  [,1] [,2]
1 "X2" "X3"
2 "X1" "X3"
3 "X2" "X4"
4 "X3" "X2"

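So for row 4, column X3 contains the lowest nonzero value and X2 contains the second lowest. Taking positions 2:3 of order(x) skips the first position, which is the zero distance of each row to itself.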
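
A variation that may be closer to what the question asks for is sketched below. It labels the matrix with the var3 values, as the loop in the question does, and masks the diagonal with Inf rather than relying on the self-distance sorting first, which could matter if two rows are identical and have a distance of 0 to each other. The names d, k and nearest are assumptions chosen for this example, and base R's dist() stands in for rdist().

```r
# Same toy data as in the question
df <- data.frame(var1 = c(1, 3, 5, 7),
                 var2 = c(4, 6, 8, 10),
                 var3 = c(11, 12, 13, 14))

# Pairwise Euclidean distances on var1 and var2 (base R stand-in for rdist)
d <- as.matrix(dist(df[, 1:2]))

# Label rows and columns with the var3 values
dimnames(d) <- list(as.character(df$var3), as.character(df$var3))

# Exclude each row's distance to itself by making it infinitely large
diag(d) <- Inf

# k is a hypothetical parameter: how many nearest columns to keep per row
k <- 2

# For each row, take the names of the k columns with the smallest distances;
# apply() returns a k x n matrix here, so t() puts one row per observation
nearest <- t(apply(d, 1, function(x) colnames(d)[order(x)[seq_len(k)]]))

nearest
```

With k set to 1, apply() returns a plain vector instead of a matrix, so the t() step would need adjusting in that case.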

Hope this helps!
