# R: Keep rows from one data frame based on values in a second

I have two data frames. The first has four columns; the fourth column, V4, contains a number that refers to a physical location.

The second data frame also has four columns. Here columns V2 and V3 are the boundaries of a range.

I am trying to retain each row of the first data frame when the number in V4 falls between V2 and V3 in any row of the second data frame. So if 62765 from V4 of data frame one falls between 20140803-20223538, 63549983-63556677, or 52236330-52315441 in the example data frame two, the whole row should be retained; otherwise it should be omitted.

I would also like to be able to do the opposite: retain each row when V4 does not fall between V2 and V3 in any row of the second data frame. Any help here would be greatly appreciated.

Data frame one

```
V1 V2         V3  V4
10 rs11511647  0  62765
10 rs12218882  0  84172
10 rs10904045  0  84426
10 rs11252127  0  88087
```

Data frame two

```
V1  V2         V3     V4
7 20140803 20223538   7A5
19 63549983 63556677  A1BG
10 52236330 52315441  A1CF
```

Here's a simple approach:

```
# check whether values of df1$V4 are between df2$V2 and df2$V3
idx <- sapply(df1$V4, function(x) any(x >= df2$V2 & x <= df2$V3))

# retain rows whose V4 falls inside at least one range
df1[idx, ]

# the opposite: retain rows whose V4 falls inside no range
df1[!idx, ]
```
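As a self-contained check, here is a minimal sketch of the same idea applied to the sample values from the question. The `data.frame()` construction and the helper name `in_any_interval` are assumptions added for illustration; they do not appear in the original post.

```r
# Sample data from the question, rebuilt as data frames (construction is assumed)
df1 <- data.frame(
  V1 = c(10, 10, 10, 10),
  V2 = c("rs11511647", "rs12218882", "rs10904045", "rs11252127"),
  V3 = c(0, 0, 0, 0),
  V4 = c(62765, 84172, 84426, 88087),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  V1 = c(7, 19, 10),
  V2 = c(20140803, 63549983, 52236330),
  V3 = c(20223538, 63556677, 52315441),
  V4 = c("7A5", "A1BG", "A1CF"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE for each position that lies inside
# at least one closed interval [start, end]
in_any_interval <- function(pos, start, end) {
  vapply(pos, function(x) any(x >= start & x <= end), logical(1))
}

idx <- in_any_interval(df1$V4, df2$V2, df2$V3)

inside  <- df1[idx, ]   # rows whose V4 lies in some range
outside <- df1[!idx, ]  # rows whose V4 lies in no range
```

With these sample values none of the four positions falls inside a range, so `inside` has no rows and `outside` holds all four. `vapply()` is used instead of `sapply()` only so that the result is guaranteed to be a logical vector, even for empty input.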

REVISED

Using @akrun's data and taking inspiration from @Sven Hohenstein's code, here's another approach.

```
df1 <- data.frame(
  V1 = c(10, 10, 10, 10),
  V2 = c("rs11511647", "rs12218882", "rs10904045", "rs11252127"),
  V3 = c(0, 0, 0, 0),
  V4 = c(62765, 63549985, 84426, 88087),
  stringsAsFactors = FALSE)

df2 <- data.frame(
  V1 = c(7, 19, 10),
  V2 = c(20140803, 63549983, 52236330),
  V3 = c(20223538, 63556677, 52315441),
  V4 = c("7A5", "A1BG", "A1CF"),
  stringsAsFactors = FALSE)

library(dplyr)

# evaluate each row of df1 on its own, flag it when V4 lies in any df2 range,
# then keep only the flagged rows
df1 %>%
  rowwise() %>%
  mutate(test = ifelse(any(V4 >= df2$V2 & V4 <= df2$V3), 1, 0)) %>%
  filter(test == 1)

#  V1         V2 V3       V4 test
#1 10 rs12218882  0 63549985    1
```

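Here's another possibility:

```
# for each row of df1, test V4 against every range in df2
# (strict inequalities, so the boundary values themselves are excluded)
idx <- sapply(seq_len(nrow(df1)), function(y) {
  any(df1$V4[y] > df2[, 2] & df1$V4[y] < df2[, 3])
})
df1[idx, ]
#   V1         V2 V3       V4
# 2 10 rs12218882  0 63549985
```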
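For larger inputs, the per-row loop can be swapped for a fully vectorized comparison. The following is a sketch of a different technique from the answers above, using base R's `outer()` to compare every V4 value against every range at once; the variable name `hits` is an assumption for illustration, and the data are the same modified sample used in the dplyr answer.

```r
df1 <- data.frame(
  V1 = c(10, 10, 10, 10),
  V2 = c("rs11511647", "rs12218882", "rs10904045", "rs11252127"),
  V3 = c(0, 0, 0, 0),
  V4 = c(62765, 63549985, 84426, 88087),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  V1 = c(7, 19, 10),
  V2 = c(20140803, 63549983, 52236330),
  V3 = c(20223538, 63556677, 52315441),
  V4 = c("7A5", "A1BG", "A1CF"),
  stringsAsFactors = FALSE
)

# hits[i, j] is TRUE when df1$V4[i] lies inside the j-th range of df2
hits <- outer(df1$V4, df2$V2, ">=") & outer(df1$V4, df2$V3, "<=")

# a row of df1 qualifies when it matches at least one range
idx <- rowSums(hits) > 0

df1[idx, ]   # rows inside some range
df1[!idx, ]  # rows inside no range
```

The intermediate matrix has `nrow(df1) * nrow(df2)` cells, so this trades memory for speed and may not suit very large data frames.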