How to count occurrences of a substring within a character string in R

I know this is a very naive question, but I have tried a lot and still haven't found a way to count the number of occurrences of a specified substring within a character string in R.

For example:

str <- "Hello this is devavrata! here, say again hello"

      

Now I want to find the number of occurrences of hello, ignoring case. In this example, the answer should be 2.
EDIT: When I search for ello th, str_count returns 1, but I want only the exact word, delimited by spaces, to be counted, so in this case the result should be 0.
For example, suppose I want to search for very good in a specific line, e.g.:

It is very good to speak like thevery good


The count here should be 1, not 2. I hope that is clear.



3 answers


You can also try:



library(stringi)
stri_count(str, regex="(?i)hello")
#[1] 2

str1 <- "It is very good to speak like thevery good"
stri_count(str1, regex="\\b(?i)very good\\b")
#[1] 1

      



Perhaps the simplest way is to use str_count from stringr:

str <- "Hello this is devavrata! here, say again hello"
library(stringr)
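# ignore.case() is deprecated in current stringr; regex(..., ignore_case = TRUE) replaces it
str_count(str, regex("hello", ignore_case = TRUE))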
# [1] 2

      

Two base R methods:



length(grep("hello", strsplit(str, " ")[[1]], ignore.case = TRUE))
# [1] 2

      

and

sum(gregexpr("hello", str, ignore.case = TRUE)[[1]] > 0)
# [1] 2
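Neither base R call above enforces whole-word matching, which is what the question's edit asks for. A minimal sketch of one way to do that in base R, assuming the search phrase contains no regex metacharacters; count_whole is a hypothetical helper name, not part of any package:

```r
# Hypothetical helper: count case-insensitive whole-word matches of a phrase.
# Assumes `phrase` has no regex metacharacters (it is pasted into the pattern as-is).
count_whole <- function(text, phrase) {
  pattern <- paste0("\\b", phrase, "\\b")            # \\b marks a word boundary
  m <- gregexpr(pattern, text, ignore.case = TRUE, perl = TRUE)
  lengths(regmatches(text, m))                       # 0 when there is no match
}

str  <- "Hello this is devavrata! here, say again hello"
str1 <- "It is very good to speak like thevery good"

count_whole(str, "hello")       # 2
count_whole(str, "ello th")     # 0, because "ello" starts inside "Hello"
count_whole(str1, "very good")  # 1, because "thevery good" has no boundary before "very"
```

Using regmatches() with lengths() avoids the special case where gregexpr() returns -1 for no match, and it stays vectorized over text.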

      



I'm late to the party, but I think the function termco from the package qdap does exactly what you want. You use leading and/or trailing spaces to control word boundaries, as shown in the example below:

x <- c("Hello this is devavrata! here, say again hello",
    "It is very good to speak like thevery good")

library(qdap)
(out <- termco(x, id(x), list("hello", "very good", " very good ")))

##   x word.count     hello very good very good
## 1 1          8 2(25.00%)         0         0
## 2 2          9         0 2(22.22%) 1(11.11%)

## To get a data frame of pure counts:
out %>% counts()

##   x word.count hello very good very good
## 1 1          8     2         0         0
## 2 2          9     0         2         1
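For readers without qdap installed, the space-delimiting idea can be roughly imitated in base R. This is only a sketch of the idea, not qdap's implementation: count_spaced is a hypothetical helper that pads the lowercased text with a space on each side and counts literal, non-overlapping matches.

```r
# Hypothetical helper: literal, case-insensitive counting where spaces in
# `term` act as word delimiters. Not how qdap implements termco().
count_spaced <- function(text, term) {
  padded <- paste0(" ", tolower(text), " ")          # lets terms match at the ends
  m <- gregexpr(tolower(term), padded, fixed = TRUE) # fixed = TRUE: no regex
  lengths(regmatches(padded, m))
}

x <- c("Hello this is devavrata! here, say again hello",
    "It is very good to speak like thevery good")

count_spaced(x, "hello")        # 2 0
count_spaced(x, "very good")    # 0 2
count_spaced(x, " very good ")  # 0 1
```

Unlike a \\b word boundary, this treats only spaces as delimiters, so a word followed directly by punctuation (for example "hello,") would not match a term with a trailing space.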

      
