Quantile-normalize one column in R

I have a column in my data frame in R, data$height. The values range from 0 to 400. I want to normalize the values in the column so that the results lie between 0 and 1 and are quantiles; that is, the median value in the dataset should map to 0.5.

Any suggestions on how to do this?



2 answers


The R function ppoints is a common way of mapping values to their percentile ranks. It returns n probabilities in increasing order, so index the result by rank(x) to line it up with your data.

See its argument a:

- Setting a = 1 maps the lowest value to 0 and the highest value to 1.
- Setting a = 0 maps the smallest value to 1/(n + 1) and the largest value to n/(n + 1).
- By default, a = 3/8 if n is 10 or less and a = 1/2 if n is greater than 10.

Other functions in R use ppoints. For example, qqnorm calls it to produce normal quantile-quantile plots.
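As a rough sketch of how this could be applied to the column from the question: the data frame below is made-up example data, and the column names q_a1, q_a0 and q_default are hypothetical. It assumes there are no NA values, and it uses ties.method = "first" so that the ranks are integers that can be used as indices.

```r
# Made-up example data; `height` mirrors the column name in the question.
set.seed(1)
data <- data.frame(height = runif(7, min = 0, max = 400))
n <- nrow(data)

# Integer ranks (ties broken by order of appearance) used to index
# the increasing probabilities returned by ppoints().
r <- rank(data$height, ties.method = "first")

data$q_a1      <- ppoints(n, a = 1)[r]  # (i - 1) / (n - 1): min -> 0, max -> 1
data$q_a0      <- ppoints(n, a = 0)[r]  # i / (n + 1)
data$q_default <- ppoints(n)[r]         # a = 3/8 here because n <= 10

# Show the rows sorted by height to make the mapping easy to read.
data[order(data$height), ]
```

With an odd number of distinct values, the median observation gets 0.5 under all three settings; only the spacing toward the extremes differs.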



You want some form of rank, for example:



> set.seed(1)
> exdf <- data.frame(height = runif(5, min=0, max=400))
> exdf$r1 <- (rank(exdf$height) - 1) / (length(exdf$height)-1)
> exdf$r2 <- (rank(exdf$height)-1/2) /  length(exdf$height)
> exdf 
     height   r1  r2
1 106.20347 0.25 0.3
2 148.84956 0.50 0.5
3 229.14135 0.75 0.7
4 363.28312 1.00 0.9
5  80.67277 0.00 0.1
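
If the column may contain ties or missing values, the r1 formula could be wrapped in a small helper along these lines. This is only a sketch: quantile_scale is a hypothetical name, and the choice to average tied ranks and to leave NA in place is an assumption, not something the answer specifies.

```r
# Hypothetical helper: rank-based rescaling of a numeric vector to [0, 1].
# Tied values share their average rank; NA values stay NA.
quantile_scale <- function(x) {
  r <- rank(x, na.last = "keep", ties.method = "average")
  n <- sum(!is.na(x))          # number of non-missing values
  (r - 1) / (n - 1)            # NaN if there is only one non-missing value
}

heights <- c(120, 80, NA, 200, 120, 400)  # made-up values
scaled <- quantile_scale(heights)
scaled
# 0.375 0.000 NA 0.750 0.375 1.000
```

It could then be applied as data$height_q <- quantile_scale(data$height), where height_q is again just an illustrative column name.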

      
