Vector-based subset data with repeated observations

Question

I have the following data with two observations per subject:

SUBJECT <- c(8,8,10,10,11,11,15,15)
POSITION <- c("H","L","H","L","H","L","H","L")
TIME <- c(90,90,30,30,30,30,90,90)
RESPONSE <- c(5.6,5.2,0,0,4.8,4.9,1.2,.9)

DATA <- data.frame(SUBJECT,POSITION,TIME,RESPONSE)

I need the rows of DATA whose SUBJECT number appears in a vector, V:

V <- c(8,10,10)

How can I get both observations from DATA for each SUBJECT number in V, with those observations repeated as many times as that SUBJECT number appears in V?

Desired output:

SUBJECT <- c(8,8,10,10,10,10)
POSITION <- c("H","L","H","L","H","L")
TIME <- c(90,90,30,30,30,30)
RESPONSE <- c(5.6,5.2,0,0,0,0)

OUT <- data.frame(SUBJECT,POSITION,TIME,RESPONSE)

I figured some variation of the %in% operator would do the trick, but it doesn't account for duplicate subject numbers in V. Even if a subject number appears twice in V, I only get one copy of the corresponding rows in DATA.
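A minimal sketch of that behaviour, using the same DATA and V as above (the name `sub` is only illustrative and not part of the original code):

```r
# Data from the question
DATA <- data.frame(
  SUBJECT  = c(8, 8, 10, 10, 11, 11, 15, 15),
  POSITION = c("H", "L", "H", "L", "H", "L", "H", "L"),
  TIME     = c(90, 90, 30, 30, 30, 30, 90, 90),
  RESPONSE = c(5.6, 5.2, 0, 0, 4.8, 4.9, 1.2, 0.9)
)
V <- c(8, 10, 10)

# %in% only tests membership, so the duplicated 10 in V has no effect
sub <- DATA[DATA$SUBJECT %in% V, ]
nrow(sub)  # 4 rows, not the 6 that are wanted
```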

I could also write a loop and append the appropriate observations, but this step sits inside a bootstrap sampler, and a loop would significantly increase the computation time.

r subset

Poca 05 Apr 17 at 22:48

1 answer

thelatemail · Answer 1 · 2017-04-05T22:51:23+0000

merge is your friend:

merge(list(SUBJECT=V), DATA)
#  SUBJECT POSITION TIME RESPONSE
#1       8        H   90      5.6
#2       8        L   90      5.2
#3      10        H   30      0.0
#4      10        L   30      0.0
#5      10        H   30      0.0
#6      10        L   30      0.0

As @Frank suggests, this logic can be translated to data.table, dplyr, SQL, or anything else that can handle a left join.
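Since the code runs inside a bootstrap sampler, the same lookup can also be sketched in base R as plain row indexing, without merge. This is only an illustration and not part of the original answer: the names `rows_by_subject` and `idx` are made up here, and it assumes the SUBJECT values convert cleanly to character keys (true for whole numbers like these).

```r
# Data from the question
DATA <- data.frame(
  SUBJECT  = c(8, 8, 10, 10, 11, 11, 15, 15),
  POSITION = c("H", "L", "H", "L", "H", "L", "H", "L"),
  TIME     = c(90, 90, 30, 30, 30, 30, 90, 90),
  RESPONSE = c(5.6, 5.2, 0, 0, 4.8, 4.9, 1.2, 0.9)
)
V <- c(8, 10, 10)

# Row numbers of DATA grouped by SUBJECT; this can be built once,
# outside the bootstrap loop (hypothetical helper name)
rows_by_subject <- split(seq_len(nrow(DATA)), DATA$SUBJECT)

# Look up each element of V by name; repeated subjects are looked up repeatedly
idx <- unlist(rows_by_subject[as.character(V)], use.names = FALSE)

# Index DATA by the collected row numbers
OUT <- DATA[idx, ]
OUT
```

One difference worth noting: merge sorts its result by the key column by default, whereas this indexing keeps the rows in the order of V.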
