Efficient chaining using dplyr in R

I have a fairly simple problem that I could hack together, but I would rather do something more efficient in R using things like dplyr. That said, this question is probably dead simple to someone who's pretty good with this package.

I have a data frame with 3 columns and 30 rows (for simplicity). I would like to calculate the 87th percentile score of each column. After that, I would like to normalize this score to a range between 0 and 1. Put simply, the normalization is done with

(x - min(x)) / (max(x) - min(x))

So the second statement below uses dplyr.

library(dplyr)

DF <- data.frame(matrix(runif(90, min=0, max=100), ncol=3, nrow=30))
# 87th percentile of every column
DF_87th_percentile <- DF %>% 
    summarise_each(funs(quantile(., c(0.87))))

      

At that point I have the 87th percentile score calculated, but then I stumble and start creating the variables min and max,

min <- apply(DF, 2, min)
max <- apply(DF, 2, max)

      

and then

normalized_score <- (DF_87th_percentile - min) / (max - min)

      

Is there a way to rewrite the last parts with dplyr? How could I chain the last pieces together? So far, my efforts have not gone well. Thank you for your help.


2 answers


You need to write the normalization as a function in order to chain it with dplyr. For example:

library(dplyr)

# Min-max scale x to the range [0, 1]
mynorm <- function(x) { (x - min(x)) / (max(x) - min(x)) }

DF <- data.frame(matrix(runif(90, min=0, max=100), ncol=3,nrow=30))

DF %>% 
    summarise_each(funs(quantile(., c(0.87)))) %>%
    mynorm()

      



Example result:

  X1 X2       X3
1  0  1 0.986836
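
Note that one column in this result is exactly 0 and another exactly 1. Because `mynorm()` is applied to the one-row summary, `min(x)` and `max(x)` are taken over the three percentiles themselves, not over each column's original values. The sketch below contrasts the two readings. It assumes dplyr 1.0.0 or later for `across()`, and the seed and the names `p87`, `across_percentiles` and `within_columns` are illustrative only.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed, only so the run is reproducible
DF <- data.frame(matrix(runif(90, min = 0, max = 100), ncol = 3, nrow = 30))

mynorm <- function(x) { (x - min(x)) / (max(x) - min(x)) }

# One-row data frame holding each column's 87th percentile
p87 <- DF %>%
  summarise(across(everything(), ~ quantile(.x, 0.87, names = FALSE)))

# mynorm() only sees those three numbers:
# the smallest percentile maps to 0 and the largest to 1
across_percentiles <- mynorm(p87)

# Scaling each percentile against its own column's min and max,
# which is what the question computes by hand
col_min <- sapply(DF, min)
col_max <- sapply(DF, max)
within_columns <- (p87 - col_min) / (col_max - col_min)

across_percentiles
within_columns
```

If the goal is the per-column scaling from the question, the second form is probably the one wanted.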

      



I would think that you could just change the original call:



normalized_score <- DF %>% 
     summarise_each(funs( (quantile(., c(0.87))-min(.) )/(max(.)-min(.)) ))
 normalized_score
         X1        X2        X3
1 0.9081882 0.8308022 0.9266201
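
In current dplyr releases `summarise_each()` and `funs()` are deprecated, so the same one-step calculation might be written with `summarise()` and `across()` instead. This is a minimal sketch assuming dplyr 1.0.0 or later. The helper name `scaled_quantile` and the fixed seed are not from the thread and are used only for illustration.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed, only so the run is reproducible
DF <- data.frame(matrix(runif(90, min = 0, max = 100), ncol = 3, nrow = 30))

# Hypothetical helper: where the p-th quantile of x falls
# within x's own range, on a 0 to 1 scale
scaled_quantile <- function(x, p = 0.87) {
  (quantile(x, p, names = FALSE) - min(x)) / (max(x) - min(x))
}

# Apply the helper to every column in a single summarise() call
normalized_score <- DF %>%
  summarise(across(everything(), scaled_quantile))

normalized_score
```

Keeping the formula in a named function leaves the pipeline short and makes the percentile easy to change, for example `summarise(across(everything(), scaled_quantile, p = 0.5))` in dplyr versions that still accept extra arguments through `across()`, or `~ scaled_quantile(.x, p = 0.5)` otherwise.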

      
