How do I make the grepl function specific?

My data frame is shown below. I need to extract the rows one at a time according to the value in the "geneID" column. I am using the grepl function.

#Data frame:geneDf  
geneID=c("EGFR","Her2","PTENPP","PTEN")
patient1=c(12,23,56,23)
patient2=c(23,34,11,6)
patient3=c(56,44,32,45)
patient4=c(23,64,45,23)
geneDf=data.frame(patient1,patient2,patient3,patient4,geneID)

geneDf
  patient1 patient2 patient3 patient4 geneID
1       12       23       56       23   EGFR
2       23       34       44       64   Her2
3       56       11       32       45 PTENPP
4       23        6       45       23   PTEN

      

Fetching any of the first three rows works well:

targetGene<-subset(geneDf,grepl(geneDf$geneID[1],geneDf$geneID))
targetGene
  patient1 patient2 patient3 patient4 geneID
1       12       23       56       23   EGFR

      

When I fetch the 4th row data, I get this:

targetGene<-subset(geneDf,grepl(geneDf$geneID[4],geneDf$geneID))
targetGene
  patient1 patient2 patient3 patient4 geneID
3       56       11       32       45 PTENPP
4       23        6       45       23   PTEN

      

It looks like the third row is also matched, because its "geneID" value (PTENPP) contains the value of the fourth row (PTEN). What is wrong with my command? How can I retrieve exactly one specific row each time?



2 answers


You may need a word boundary, i.e. \\b, or you can anchor the pattern with ^ and $:

subset(geneDf, grepl(paste0('^', geneID[4], '$'), geneID))
#  patient1 patient2 patient3 patient4 geneID
#4       23        6       45       23   PTEN

      



or, using word boundaries:

subset(geneDf, grepl(paste0('\\b', geneID[4], '\\b'), geneID))
#   patient1 patient2 patient3 patient4 geneID
#4       23        6       45       23   PTEN
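
As for why the original call also returned row 3: grepl treats its first argument as a regular expression and reports a match wherever the pattern occurs anywhere in the string, so "PTEN" is found inside "PTENPP". If the IDs are meant to match exactly, a plain equality comparison may be all that is needed. The sketch below rebuilds the data frame from the question and compares the three approaches. The names target, bySubstring, byAnchor and byEquality are made up for illustration.

```r
# Rebuild the data frame from the question
geneDf <- data.frame(
  patient1 = c(12, 23, 56, 23),
  patient2 = c(23, 34, 11, 6),
  patient3 = c(56, 44, 32, 45),
  patient4 = c(23, 64, 45, 23),
  geneID   = c("EGFR", "Her2", "PTENPP", "PTEN"),
  stringsAsFactors = FALSE
)

target <- geneDf$geneID[4]  # "PTEN"

# Unanchored regex: matches any ID that contains "PTEN", so "PTENPP" too
bySubstring <- geneDf[grepl(target, geneDf$geneID), ]

# Anchored regex: the whole ID must equal the pattern
byAnchor <- geneDf[grepl(paste0("^", target, "$"), geneDf$geneID), ]

# Plain equality: no regex involved at all
byEquality <- geneDf[geneDf$geneID == target, ]
```

The equality version also sidesteps any surprises if an ID ever contains a character that is special in a regular expression, such as a dot or a parenthesis.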

      



@akrun answered your specific question, but if you want to create subsets of your data according to another variable, you might also be interested in the function split:



split(geneDf, geneDf$geneID)
## $EGFR
##   patient1 patient2 patient3 patient4 geneID
## 1       12       23       56       23   EGFR
## 
## $Her2
##   patient1 patient2 patient3 patient4 geneID
## 2       23       34       44       64   Her2
## 
## $PTEN
##   patient1 patient2 patient3 patient4 geneID
## 4       23        6       45       23   PTEN
## 
## $PTENPP
##   patient1 patient2 patient3 patient4 geneID
## 3       56       11       32       45 PTENPP
## 
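
The list returned by split is named by gene ID, so one gene's rows can be pulled out by name, or every gene can be processed in turn. A small sketch, assuming the same geneDf as in the question. The names byGene, patientCols and geneTotals are only illustrative, and the per-gene sum is just a stand-in for whatever you want to compute.

```r
# Rebuild the data frame from the question
geneDf <- data.frame(
  patient1 = c(12, 23, 56, 23),
  patient2 = c(23, 34, 11, 6),
  patient3 = c(56, 44, 32, 45),
  patient4 = c(23, 64, 45, 23),
  geneID   = c("EGFR", "Her2", "PTENPP", "PTEN"),
  stringsAsFactors = FALSE
)

# One data frame per gene, stored in a list named by geneID
byGene <- split(geneDf, geneDf$geneID)

# [[ ]] matches list names exactly by default, so "PTEN" does not pick up "PTENPP"
ptenRows <- byGene[["PTEN"]]

# Process every gene in turn: here, the total over the four patient columns
patientCols <- c("patient1", "patient2", "patient3", "patient4")
geneTotals <- sapply(byGene, function(d) sum(unlist(d[patientCols])))
```

Here geneTotals is a named numeric vector, so geneTotals[["PTEN"]] gives the total for that gene alone.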

      
