How do I calculate the column that depends on the function that uses the value of the variable of each row?

Question

This is an example based on mtcars.

What I would like to do:

- calculate a column that counts the number of cars with a smaller displacement (disp) than the current row, within the same transmission type category (am)
- the expected column holds the values I would like to get
- try1 is one attempt using the function findInterval; the problem is that I can't make it count within the category-dependent subsets (am)

I've tried solutions with *apply, but I could never get the called function to work only on a subset that depends on the value of a variable in the row being processed (hopefully this makes sense).

x = mtcars[1:6,c("disp","am")]
# expected values are the number of cars that have less disp while having the same am
x$expected = c(1,1,0,1,2,0) 
#this ordered table is for findInterval
a = x[order(x$disp),] 
a
# I use the findInterval function to get the number of values and I try subsetting the call
# -0.1 is to deal with the closed interval
x$try1 = findInterval(x$disp-0.1, a$disp[a$am==x$am])
x
# try1 values are not computed depending on the subsetting of a

Any solution is welcome; using the function findInterval is optional.

I would prefer to have a more general solution to calculate the value of a column by calling a function that takes values from the current row to calculate the expected value.

+3

r

wotter Jul 29 '15 at 13:16

2 answers

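As @dimitris_ps pointed out, the previous solution ignored duplicate values. Here is a fix: the sorted values are shifted up by a small offset, so findInterval counts only strictly smaller values within each am group.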

library(dplyr)
x %>% 
  group_by(am) %>%
  mutate(expected=findInterval(disp, sort(disp) + 0.0001))

or

library(data.table)
setDT(x)[, expected:=findInterval(disp, sort(disp) + 0.0001), by=am]
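To see what the small offset does, here is a minimal base-R sketch on the three am == 1 rows of x. The vector d is typed in by hand for illustration, and the sketch assumes the tie issue is that findInterval, with its default arguments, counts values equal to the query as well as smaller ones.

```r
# disp values of the three am == 1 cars in x
# (Mazda RX4, Mazda RX4 Wag, Datsun 710)
d <- c(160, 160, 108)

# Without an offset, findInterval counts every sorted value <= the query,
# so each value counts itself and its ties
with_ties <- findInterval(d, sort(d))

# Shifting the breakpoints up slightly leaves only strictly smaller values
strictly_less <- findInterval(d, sort(d) + 0.0001)

print(with_ties)      # 3 3 1
print(strictly_less)  # 1 1 0
```

The offset only works if it is smaller than the smallest gap between distinct disp values; 0.0001 is safe here because disp is recorded to one decimal place.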

+4

Khashaa Jul 29 '15 at 13:27

dimitris_ps · Accepted Answer · 2015-07-29T13:50:55+0000

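Based on @Khashaa's logic, this is my approach. match returns the position of the first occurrence in the sorted values, so subtracting 1 gives the number of strictly smaller values, and ties need no special handling: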

library(dplyr)
mtcars %>% 
  group_by(am) %>%
  mutate(expected=match(disp, sort(disp))-1)
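For the more general case the question asks about (a function evaluated with values from the current row), a minimal base-R sketch might look like the following. The helper name count_smaller and the column names by_row and by_group are hypothetical, chosen for illustration; they are not part of either answer. The second variant applies the same match idea per group with ave from base R instead of dplyr.

```r
x <- mtcars[1:6, c("disp", "am")]

# Hypothetical helper: for row i, count the rows in the same am group
# whose disp is strictly smaller than the disp of row i
count_smaller <- function(i, df) {
  sum(df$am == df$am[i] & df$disp < df$disp[i])
}

# Row-by-row version: call the helper once per row index
x$by_row <- sapply(seq_len(nrow(x)), count_smaller, df = x)

# Grouped version: ave() applies the function to disp within each am group
x$by_group <- ave(x$disp, x$am, FUN = function(d) match(d, sort(d)) - 1)

print(x)
```

The row-by-row version compares every row against all others, so it is the more flexible but slower option; the grouped version should scale better on larger data.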
