Counting frequency when comparing against alternating groups of rows in data.frame

I have a table in which I want to:

  • group every four lines into sequential groups

  • compare each line with the 4 lines in the previous group.

Basically, each group of four lines serves as a reference set against which every line of the next group of four is compared.

Specifically, given a row in group x, I want to count how many rows in the previous group (i.e. group x-1) have a value that is less than or equal to the value in the row of interest.

I want to do this for every line.

For example, for each row in the second group (rows 5 to 8), I want to count the number of rows in the first group (rows 1 to 4) whose value is less than or equal to it. Rows 5 through 8 then become the reference group for the next four rows (9 through 12), and so on.

Row Values
1   1.35
2   0.71
3   1.00
4   0.07
5   0.53
6   0.12
7   0.36
8   2.03
9   3.83
10  1.30
11  2.17
12  1.71
13  1.52
14  1.27
15  0.29
16  0.05
17  0.14
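
For reproducibility, the table above can be entered as a data.frame. This is only a minimal sketch: the object name df is an assumption (it is the name the answers below use), and the column names are taken from the table header.

```r
# Rebuild the example table as a data.frame.
# The name `df` is an assumption; columns follow the table header.
df <- data.frame(
  Row = 1:17,
  Values = c(1.35, 0.71, 1.00, 0.07, 0.53, 0.12, 0.36, 2.03, 3.83,
             1.30, 2.17, 1.71, 1.52, 1.27, 0.29, 0.05, 0.14)
)
```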

      

The result should look like this:

Row Values  Count
1   1.35    
2   0.71    
3   1.00    
4   0.07    
5   0.53    1
6   0.12    1
7   0.36    1
8   2.03    4
9   3.83    4
10  1.30    3
11  2.17    4
12  1.71    3
13  1.52    1
14  1.27    0
15  0.29    0
16  0.05    0
17  0.14    1

      

+3




3 answers


You can try this (assuming df is your data.frame):



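# split the values into consecutive groups of four rows
sdf <- split(df$Values, (df$Row - 1) %/% 4)
# for each group, count the values in the previous group that are <= each value
c(rep(NA, 4),
  unlist(Map(function(x, y) findInterval(x, sort(y)),
             sdf[-1], sdf[-length(sdf)]), use.names = FALSE))
#[1] NA NA NA NA  1  1  1  4  4  3  4  3  1  0  0  0  1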
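
To see why findInterval does the counting here: for a sorted vector, findInterval(x, v) returns, for each element of x, the number of elements of v that are less than or equal to it. The sketch below applies it to the first two groups of the question's data only; the names ref, new and counts are illustrative, not part of the answer above.

```r
# Values of rows 1 to 4 (the reference group) and rows 5 to 8
ref <- c(1.35, 0.71, 1.00, 0.07)
new <- c(0.53, 0.12, 0.36, 2.03)

# For each element of `new`, count how many elements of `ref` are <= it
counts <- findInterval(new, sort(ref))
counts
# [1] 1 1 1 4
```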

      

+2




You can try this:



# z is a vector of your values, e.g. z <- df$Values
dat <- data.frame(row = seq_along(z), Values = z,
                  ceiling = ceiling(seq_along(z) / 4), count = NA)

for (x in seq_len(nrow(dat))) {
    # the first group has no reference group, so its count stays NA
    if (dat$ceiling[x] == 1) next
    # values of the previous group of four rows
    prev <- dat$Values[dat$ceiling == dat$ceiling[x] - 1]
    dat$count[x] <- sum(prev <= dat$Values[x])
}

      

0




Use the function ceiling with lapply or vapply.

ceiling takes a single numeric argument x and returns a numeric vector containing the smallest integers not less than the corresponding elements of x.

  • To achieve the desired effect, divide x by the number of lines you want in each group.

    ceiling(x/y) #where x = the row number and y = the number of rows per group
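
As a quick illustration of the grouping step, here is what ceiling produces for the 17 rows of the question with 4 rows per group. The names rows and groups are only illustrative.

```r
# Row numbers 1 to 17, grouped four at a time
rows <- 1:17
groups <- ceiling(rows / 4)
groups
# [1] 1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4 5
```

The last group contains only row 17, since 17 is not a multiple of 4.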
    
          

(Assuming df is your data.frame.)

With lapply:

z <- df$Values
Groups <- ceiling(seq(z)/4)
df$Count <- 
  unlist(lapply(seq(z), function(x) sum(z[x] >= z[Groups == Groups[x] - 1])))

      

or with vapply:

df$Count <- 
  vapply(seq(z), function(x) sum(z[x] >= z[Groups == Groups[x] - 1]), integer(1))

      


If you need one command:

df$Count <- 
  with(df,unlist(lapply(seq(Values), function(x) 
  sum(Values[x] >= Values[ceiling(seq(Values)/4) == ceiling(seq(Values)/4)[x] - 1]))))

      

0








