How do I make the grepl function specific?

My data frame is shown below. I need to extract rows one at a time according to the value in the "geneID" column. I am using the grepl function.


# Data frame: geneDf

  patient1 patient2 patient3 patient4 geneID
1       12       23       56       23   EGFR
2       23       34       44       64   Her2
3       56       11       32       45 PTENPP
4       23        6       45       23   PTEN


The first three rows work well. For example, the first one returns:

  patient1 patient2 patient3 patient4 geneID
1       12       23       56       23   EGFR


When I fetch the 4th row data, I get this:

  patient1 patient2 patient3 patient4 geneID
3       56       11       32       45 PTENPP
4       23        6       45       23   PTEN


It looks like the third row is matched as well, because its "geneID" value (PTENPP) contains the value of the fourth row (PTEN). What is wrong with my command? How can I get the data for one specific row each time?



2 answers

You may need a word boundary (\\b), or you can anchor the pattern to the start (^) and end ($) of the string:

subset(geneDf, grepl(paste0('^', geneID[4], '$'), geneID))
#  patient1 patient2 patient3 patient4 geneID
#4       23        6       45       23   PTEN



subset(geneDf, grepl(paste0('\\b', geneID[4], '\\b'), geneID))
#   patient1 patient2 patient3 patient4 geneID
#4       23        6       45       23   PTEN
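To see why the anchors matter, here is a minimal sketch that rebuilds the example data frame and compares an unanchored pattern with an anchored one. The original grepl call is not shown in the question, so the unanchored call below is only an assumption about what it looked like; the variable names unanchored, anchored and exact are made up for illustration.

```r
# Rebuild the example data frame from the question.
geneDf <- data.frame(
  patient1 = c(12, 23, 56, 23),
  patient2 = c(23, 34, 11, 6),
  patient3 = c(56, 44, 32, 45),
  patient4 = c(23, 64, 45, 23),
  geneID   = c("EGFR", "Her2", "PTENPP", "PTEN"),
  stringsAsFactors = FALSE
)

# Assumed form of the original call: an unanchored pattern matches
# anywhere in the string, so "PTEN" is also found inside "PTENPP".
unanchored <- grepl("PTEN", geneDf$geneID)

# Anchoring with ^ and $ requires the whole string to match.
anchored <- grepl("^PTEN$", geneDf$geneID)

# If the IDs are literal strings, plain equality avoids regular
# expressions (and their special characters) altogether.
exact <- geneDf[geneDf$geneID == "PTEN", ]
```

If an exact match is all that is needed, the == comparison is probably the simplest choice, since it does not depend on how special characters in an ID would be interpreted as a pattern.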




@akrun answered your specific question, but if you want to create subsets of your data according to another variable, you might also be interested in the split function:


split(geneDf, geneDf$geneID)
## $EGFR
##   patient1 patient2 patient3 patient4 geneID
## 1       12       23       56       23   EGFR
## $Her2
##   patient1 patient2 patient3 patient4 geneID
## 2       23       34       44       64   Her2
## $PTEN
##   patient1 patient2 patient3 patient4 geneID
## 4       23        6       45       23   PTEN
## $PTENPP
##   patient1 patient2 patient3 patient4 geneID
## 3       56       11       32       45 PTENPP
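Since split returns a named list, a single gene's rows can then be pulled out by name. A small sketch, assuming the same example data as in the question (byGene and ptenRows are illustrative names, not from the original answer):

```r
# Rebuild the example data frame from the question.
geneDf <- data.frame(
  patient1 = c(12, 23, 56, 23),
  patient2 = c(23, 34, 11, 6),
  patient3 = c(56, 44, 32, 45),
  patient4 = c(23, 64, 45, 23),
  geneID   = c("EGFR", "Her2", "PTENPP", "PTEN"),
  stringsAsFactors = FALSE
)

# One data frame per distinct geneID, stored in a named list.
byGene <- split(geneDf, geneDf$geneID)

# List names are compared exactly, so "PTEN" does not pick up "PTENPP".
ptenRows <- byGene[["PTEN"]]
```

Looping over names(byGene) is one way to process each gene in turn without building a pattern for every ID.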



