Can the OR operator be used to specify a pattern in the stringr str_extract_all function?
I am looking at several cells in a data frame and trying to extract any of several character sequences; there is only one of these sequences per cell.
This is what I mean:
dF$newColumn = str_extract_all(string = "dF$column1", pattern ="sequence_1|sequence_2")
Am I getting the syntax wrong here? Can I extract things like this with stringr? Please correct my ignorance!
1 answer
Yes, you can use | because it is the OR (alternation) operator in regular expressions. Here's an example:
vec <- c("abc text", "text abc", "def text", "text def text")
library(stringr)
str_extract_all(string = vec, pattern = "abc|def")
Result:
[[1]]
[1] "abc"
[[2]]
[1] "abc"
[[3]]
[1] "def"
[[4]]
[1] "def"
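However, in your command, you must replace "dF$column1" with dF$column1 (without quotes). With the quotes, the function searches the literal text "dF$column1" rather than the contents of the column.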
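To show the difference on a data frame, here is a minimal sketch. The data frame contents and the patterns "abc" and "def" are made up for illustration and stand in for the asker's real column and sequences. Because the question says there is only one sequence per cell, the sketch uses str_extract, which returns a plain character vector (with NA where nothing matches) rather than the list returned by str_extract_all; that is a suggested variation, not part of the original answer.

```r
library(stringr)

# Hypothetical data standing in for dF from the question
dF <- data.frame(column1 = c("abc text", "text abc", "def text", "no match"),
                 stringsAsFactors = FALSE)

# Unquoted column reference: the pattern is applied to every cell.
# str_extract returns the first match per cell, or NA if there is none.
dF$newColumn <- str_extract(string = dF$column1, pattern = "abc|def")

# Quoted version from the question: this searches the literal text
# "dF$column1", which contains neither sequence, so nothing is found.
quoted <- str_extract_all(string = "dF$column1", pattern = "abc|def")

print(dF)
print(quoted)
```

If a cell could contain more than one of the sequences, str_extract_all with the unquoted column would be the safer choice, at the cost of getting a list column back.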