Efficient chaining using dplyr in R
I have a fairly simple problem that I could hack together, but I would rather do something more efficient in R using things like dplyr. Speaking of which, this question is probably dead simple to someone who's pretty good with this package.
I have a data frame with 3 columns and 30 rows (for simplicity). I would like to calculate the 87th percentile score of each column. After that, I would like to normalize this score to a range between 0 and 1. Quite simply, the normalization is done with (x - min) / (max - min).
So the second statement below uses dplyr.
library(dplyr)  # provides %>%, summarise_each() and funs()
DF <- data.frame(matrix(runif(90, min=0, max=100), ncol=3,nrow=30))
DF_87th_percentile <- DF %>%
  summarise_each(funs(quantile(., c(0.87))))
After that, I have the 87th percentile score calculated, but then I stumble and start creating the variables min and max,
min <- apply(DF, 2, min)
max <- apply(DF, 2, max)
and then
normalized_score <- (DF_87th_percentile - min) / (max - min)
Is there a way to rewrite the last parts with dplyr? How can I chain the last pieces together? So far, my efforts have not been good. Thank you for your help.
You need to write the normalization as a function in order to chain it with dplyr. For example:
mynorm <- function(x) { (x - min(x)) / (max(x) - min(x)) }
DF <- data.frame(matrix(runif(90, min=0, max=100), ncol=3,nrow=30))
DF %>%
summarise_each(funs(quantile(., c(0.87)))) %>%
mynorm()
Example result:
X1 X2 X3
1 0 1 0.986836
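One thing to watch for: in the pipeline above, mynorm() receives the one-row summary, so min(x) and max(x) are taken over the three percentile values rather than over each original column, which is why the result contains exactly 0 and 1. If the goal is the question's per-column (percentile - min) / (max - min), a minimal sketch could look like the following. It assumes dplyr 1.0 or later for across(); pct_norm is a hypothetical helper name, and set.seed() is there only to make the sketch reproducible.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed, only for reproducibility of the sketch
DF <- data.frame(matrix(runif(90, min = 0, max = 100), ncol = 3, nrow = 30))

# Hypothetical helper: where the p-th percentile of x falls within
# the range of x itself, on a 0-1 scale.
pct_norm <- function(x, p = 0.87) {
  (quantile(x, p, names = FALSE) - min(x)) / (max(x) - min(x))
}

# Apply the helper to every column; the result is a one-row data frame.
normalized_score <- DF %>%
  summarise(across(everything(), pct_norm))

normalized_score
```

Because the percentile, minimum, and maximum are all computed inside the helper on the same column, each value is guaranteed to lie between 0 and 1, and no intermediate min and max variables are needed.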