R retrieves elements from a string

Question


I am trying to extract all words containing two adjacent vowels in a given string.

x <- "The team sat next to each other all year and still failed."

The result should be "team", "each", "year", "failed".

So far I've tried using [aeiou][aeiou] with the help of regmatches, but that only gives me part of each word.

Thanks.

+3

regex r

jason.nash 06 dec. '14 at 6:31


4 answers

words <- unlist(strsplit(x, " "))
words[grepl("[aeiou]{2}", words)]
#[1] "team"    "each"    "year"    "failed."

If you want to strip the punctuation as well, split on punctuation characters in addition to spaces:

words <- unlist(strsplit(x, "[[:punct:] ]"))
words[grepl("[aeiou]{2}", words)]

+4

42- 06 dec. '14 at 6:41


\w*[aeiou][aeiou]\w*

Try this pattern. See the demo:

https://regex101.com/r/hJ3zB0/5

+1

vks 06 dec. '14 at 6:33


The same with stringr:

library(stringr)
xx <- str_split(x, " ")[[1]]
xx[str_detect(xx, "[aeiou]{2}")]
## [1] "team"    "each"    "year"    "failed."

Edit

As @akrun showed, this can be simplified to

str_extract_all(x, "\\w*[aeiou]{2}\\w*")[[1]]
## [1] "team"   "each"   "year"   "failed"

+1

johannes 06 dec. '14 at 11:53


hwnd · Accepted Answer · 2014-12-06T06:33:39+0000

You can place \w* before and after the character class to match "zero or more" word characters on either side of the two vowels.

x <- "The team sat next to each other all year and still failed."
regmatches(x, gregexpr('\\w*[aeiou]{2}\\w*', x))[[1]]
# [1] "team"   "each"   "year"   "failed"
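To make the difference from the original attempt concrete, here is a short side-by-side sketch. It uses only the string and the two patterns already shown in this thread; the variable names parts and whole are just illustrative.

```r
x <- "The team sat next to each other all year and still failed."

# The pattern from the question matches only the vowel pair itself.
parts <- regmatches(x, gregexpr("[aeiou][aeiou]", x))[[1]]
# [1] "ea" "ea" "ea" "ai"

# Surrounding the class with \\w* extends each match to the whole word.
# Note the doubled backslashes required inside an R string literal.
whole <- regmatches(x, gregexpr("\\w*[aeiou]{2}\\w*", x))[[1]]
# [1] "team"   "each"   "year"   "failed"
```

Because \w does not match the full stop, the trailing "." after "failed" is left out without any separate punctuation handling.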
