Counting frequency when comparing against alternating groups of rows in data.frame
I have a table in which I want to:
- group every four rows into sequential groups
- compare each row with the four rows in the previous group.
Basically, each set of four rows serves as a reference set, and each row in the next group of four is compared against it.
Specifically, given the row in group x, I want to count how many rows in the previous group (i.e. group x-1) have a value that is less than or equal to the value in the row of interest.
I want to do this for every line.
Hence, for each row in the second group of four rows (say rows 5 to 8), I want to count the number of rows in the first group (rows 1 to 4) that have a value less than or equal to it. Rows 5 through 8 then become the reference group for the next four rows (9 through 12), and so on.
Row Values
1 1.35
2 0.71
3 1.00
4 0.07
5 0.53
6 0.12
7 0.36
8 2.03
9 3.83
10 1.30
11 2.17
12 1.71
13 1.52
14 1.27
15 0.29
16 0.05
17 0.14
The result should look like this:
Row Values Count
1 1.35
2 0.71
3 1.00
4 0.07
5 0.53 1
6 0.12 1
7 0.36 1
8 2.03 4
9 3.83 4
10 1.30 3
11 2.17 4
12 1.71 3
13 1.52 1
14 1.27 0
15 0.29 0
16 0.05 0
17 0.14 1
You can try this:
# z is a numeric vector holding your values
dat <- data.frame(row = seq_along(z), Values = z,
                  ceiling = ceiling(seq_along(z) / 4),  # group index: rows 1-4 -> 1, rows 5-8 -> 2, ...
                  count = rep(NA_integer_, length(z)))
for (x in seq_len(nrow(dat))) {
  # count the values in the previous group that are <= the current value
  dat$count[x] <- sum(dat$Values[dat$ceiling == dat$ceiling[x] - 1] <= dat$Values[x])
}
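As a quick check, here is a minimal sketch that fills in z with the values from the question and applies the same loop. One difference from the desired output: the first group has no predecessor, so the loop yields 0 there rather than a blank. The last step replaces those zeros with NA, which is only an assumption about how the blanks should be represented; the column name `group` is also just illustrative.

```r
# Values copied from the question
z <- c(1.35, 0.71, 1.00, 0.07, 0.53, 0.12, 0.36, 2.03, 3.83,
       1.30, 2.17, 1.71, 1.52, 1.27, 0.29, 0.05, 0.14)

dat <- data.frame(row = seq_along(z), Values = z)
dat$group <- ceiling(dat$row / 4)   # group id: rows 1-4 -> 1, rows 5-8 -> 2, ...
dat$count <- NA_integer_

for (x in seq_len(nrow(dat))) {
  # values belonging to the group just before row x's group
  prev <- dat$Values[dat$group == dat$group[x] - 1]
  dat$count[x] <- sum(prev <= dat$Values[x])
}

# Assumption: rows in the first group have nothing to compare against,
# so mark them as NA instead of 0
dat$count[dat$group == 1] <- NA
print(dat)
```

Rows 5 to 17 should then carry the counts listed in the question.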
Use the function ceiling with lapply or vapply.
ceiling takes a single numeric argument x and returns a numeric vector containing the smallest integers not less than the corresponding elements of x.
To get the group number of each row, divide the row number by the number of rows you want in each group and take the ceiling:
ceiling(x/y) #where x = the row number and y = the number of rows per group
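To make the grouping concrete, here is a small sketch for 17 rows in groups of 4, as in the question. The variable names `rows` and `groups` are only illustrative.

```r
# Row numbers 1..17 assigned to groups of 4 (the last group is incomplete)
rows <- seq_len(17)
groups <- ceiling(rows / 4)
print(groups)
# 1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4 5
```

Row 17 lands in a fifth group by itself, which is why it is still compared against rows 13 to 16.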
(Assuming df is your data.frame.)
With lapply:
z <- df$Values
Groups <- ceiling(seq(z)/4)
df$Count <-
unlist(lapply(seq(z), function(x) sum(z[x] >= z[Groups == Groups[x] - 1])))
or with vapply:
df$Count <-
vapply(seq(z), function(x) sum(z[x] >= z[Groups == Groups[x] - 1]), integer(1))
If you need one command:
df$Count <-
with(df,unlist(lapply(seq(Values), function(x)
sum(Values[x] >= Values[ceiling(seq(Values)/4) == ceiling(seq(Values)/4)[x] - 1]))))
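If the group size might change, the same idea can be wrapped in a small function. This is only a sketch: `count_leq_previous` and its `size` argument are hypothetical names, not part of base R, and rows in the first group get 0 because there is no previous group to compare against.

```r
# Hypothetical helper: for each element, count how many values in the
# preceding block of `size` elements are less than or equal to it.
count_leq_previous <- function(values, size = 4) {
  groups <- ceiling(seq_along(values) / size)
  vapply(seq_along(values), function(i) {
    sum(values[groups == groups[i] - 1] <= values[i])
  }, integer(1))
}

# First eight values from the question
count_leq_previous(c(1.35, 0.71, 1.00, 0.07, 0.53, 0.12, 0.36, 2.03))
# 0 0 0 0 1 1 1 4
```

Computing the group vector once inside the function avoids recalculating ceiling(...) for every row, as the one-command version does.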