How to return rows with a keyword inside a string contained in a cell in R
I thought it would be a simple line of code, but the solution to my problem is eluding me. I'm sure my limited R programming experience is the cause.
Data set
df <- structure(list(Key_MXZ = c(1731025L, 1731022L, 1731010L, 1730996L,
1722128L, 1722125L, 1722124L, 1722123L, 1722121L, 1722116L, 1722111L,
1722109L), Key_Event = c(1642965L, 1642962L, 1647418L, 1642936L,
1634904L, 1537090L, 1537090L, 1616520L, 1634897L, 1634892L, 1634887L,
1634885L), Number_Call = structure(c(11L, 9L, 10L, 12L, 1L, 3L,
2L, 4L, 5L, 6L, 8L, 7L), .Label = c("3004209178-2010-04468",
"3004209178-2010-04469", "3004209178-2010-04470", "3004209178-2010-04471",
"3004209178-2010-04472", "3004209178-2010-04475", "3004209178-2010-04477",
"3004209178-2010-04478", "3004209178-2010-04842", "3004209178-2010-04850",
"I wish to return this row with the header", "Maybe this row will work too"
), class = "factor")), .Names = c("Key_MXZ", "Key_Event", "Number_Call"
), class = "data.frame", row.names = c("1", "2", "3", "4", "5",
"6", "7", "8", "9", "10", "11", "12"))
In the last column, I have placed two text entries among the other values. The phrase "this row" in those entries should identify the rows to keep in the new data frame. The end result might look like this:
Key_MXZ|Key_Event|Number_Call
1|1731025|1642965|I wish to return this row with the header
4|1730996|1642936|Maybe this row will work too
I've tried the following code variations, and others not shown, with little success.
txt <- c("this row")
table1 <- df[grep(txt,df),]
table2 <- df[pmatch(txt,df),]
df[,3]<-is.logical(df[,3])
table3 <- subset(df,grep(txt,df[,3]))
Any ideas on this issue?
df[grep("this row", df$Number_Call, fixed=TRUE),]
# Key_MXZ Key_Event Number_Call
#1 1731025 1642965 I wish to return this row with the header
#4 1730996 1642936 Maybe this row will work too
You just need to specify the actual column you want grep to search.
fixed = TRUE treats the pattern as a literal string rather than a regular expression, and grep returns the indices of the elements of the vector that match. If your match is a little more subtle, you can drop fixed = TRUE and replace "this row" with a regular expression.
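As a rough sketch of the points above, the following uses a hypothetical four-row stand-in for the question's data (`df_small`, built from rows 1 to 4 of `df`) to compare `grep` indices, a `grepl` logical mask, and a regular-expression pattern. The names `df_small`, `idx`, `keep`, `by_logical` and `by_regex` are illustrative and not part of the original post.

```r
# Hypothetical four-row stand-in for the question's df (rows 1 to 4)
df_small <- data.frame(
  Key_MXZ = c(1731025L, 1731022L, 1731010L, 1730996L),
  Key_Event = c(1642965L, 1642962L, 1647418L, 1642936L),
  Number_Call = factor(c(
    "I wish to return this row with the header",
    "3004209178-2010-04842",
    "3004209178-2010-04850",
    "Maybe this row will work too"
  ))
)

# grep() returns the integer positions of the matching elements
idx <- grep("this row", df_small$Number_Call, fixed = TRUE)

# grepl() returns one TRUE/FALSE per element, which also works with subset()
keep <- grepl("this row", df_small$Number_Call, fixed = TRUE)
by_logical <- subset(df_small, keep)

# Regular-expression variant: without fixed = TRUE the pattern is a regex;
# here "this" and "row" may be separated by one or more whitespace characters
by_regex <- df_small[grepl("this[[:space:]]+row", df_small$Number_Call), ]

print(df_small[idx, ])
```

The logical mask from `grepl` is what `subset` expects, which is likely why the `subset(df, grep(...))` attempt in the question did not behave as hoped.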
Quite similar to DMT's answer. Below is the data.table approach, which is fast if you have millions of rows:
library(data.table)
setDT(df); setkey(df, Number_Call)
df[grep("this row", Number_Call, ignore.case = TRUE)]
Key_MXZ Key_Event Number_Call
1: 1731025 1642965 I wish to return this row with the header
2: 1730996 1642936 Maybe this row will work too
Here is an approach that uses the qdap function Search. It is a wrapper for agrep, so it can do fuzzy matching, and the degree of fuzziness can be set:
library(qdap)
Search(df, "this row", 3)
## Key_MXZ Key_Event Number_Call
## 1 1731025 1642965 I wish to return this row with the header
## 4 1730996 1642936 Maybe this row will work too
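For readers without qdap, a minimal sketch of the same fuzzy-matching idea using base R's `agrep` directly might look like the following. The data frame `df_small` is a hypothetical four-row stand-in for the question's data, and the misspelled pattern `"this rov"` and the `max.distance` value of 0.2 are assumptions chosen only to show the tolerance at work.

```r
# Hypothetical four-row stand-in for the question's df (rows 1 to 4)
df_small <- data.frame(
  Key_MXZ = c(1731025L, 1731022L, 1731010L, 1730996L),
  Key_Event = c(1642965L, 1642962L, 1647418L, 1642936L),
  Number_Call = c(
    "I wish to return this row with the header",
    "3004209178-2010-04842",
    "3004209178-2010-04850",
    "Maybe this row will work too"
  )
)

# A literal search for the misspelled phrase finds nothing
exact <- grep("this rov", df_small$Number_Call, fixed = TRUE)

# agrep() tolerates small differences; a max.distance below 1 is read as a
# fraction of the pattern length, so larger values allow looser matches
hits <- agrep("this rov", as.character(df_small$Number_Call),
              max.distance = 0.2)

print(df_small[hits, ])
```

Raising `max.distance` loosens the match and lowering it tightens it, so it is worth checking the returned rows whenever the tolerance is changed.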