String Conversion in R | Grouping the words of a string

Question

I want to group the words of the following string:

text="Lorem,ipsum,dolor,sit,amet,consectetuer"

into overlapping pairs of adjacent words, like this:

textNew="Lorem ipsum,ipsum dolor,dolor sit,sit amet,amet consectetuer"

Thanks.

+3

string regex r

sidpat 22 Aug '14 at 7:29

5 answers

Here's one option:

x <- strsplit(text, ",")[[1]]
paste0(sapply(1:(length(x)-1), function(z) paste(x[c(z, z+1)], collapse = " ")), collapse = ",")
[1] "Lorem ipsum,ipsum dolor,dolor sit,sit amet,amet consectetuer"
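One caveat with the index-based approach above: when the input holds a single word, `1:(length(x)-1)` becomes `1:0` and counts downward, so the result contains an `NA`. A minimal sketch of a variant that avoids explicit indices is shown below. The helper name `pair_words` and its `delim` parameter are made up for this illustration and are not part of the original answer.

```r
# pair_words() is a hypothetical helper, not from the original answer.
pair_words <- function(s, delim = ",") {
  x <- strsplit(s, delim, fixed = TRUE)[[1]]
  # head(x, -1) drops the last word, tail(x, -1) drops the first one;
  # both are empty when x has fewer than two words, so no NA appears.
  paste(head(x, -1), tail(x, -1), collapse = delim)
}

text <- "Lorem,ipsum,dolor,sit,amet,consectetuer"
pair_words(text)
# [1] "Lorem ipsum,ipsum dolor,dolor sit,sit amet,amet consectetuer"

pair_words("Lorem")
# [1] ""
```

Returning an empty string for a one-word input is a design choice made here; returning the word unchanged would be equally defensible depending on the use case.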

+4

docendo discimus 22 Aug '14 at 7:54

Something similar.

text <- "Lorem,ipsum,dolor,sit,amet,consectetuer"
# Split on commas, then paste each word to its right-hand neighbour
text2 <- unlist(strsplit(text, ","))
textNew <- paste0(sapply(1:(length(text2) - 1), function(i, y = text2) paste(y[i], y[i + 1])), collapse = ",")

+2

sidpat 22 Aug '14 at 8:00

You can also do:

library(stringr)
txt2 <- str_extract_all(text, "[^,]+")[[1]]
paste(paste(txt2[-length(txt2)], txt2[-1], sep = " "), collapse = ", ")
#[1] "Lorem ipsum, ipsum dolor, dolor sit, sit amet, amet consectetuer"

or

library(gsubfn)
paste(strapply(text, "([^,]+),(?=([^,]+))", paste, backref = -2, perl = TRUE)[[1]], collapse = ",")
#[1] "Lorem ipsum,ipsum dolor,dolor sit,sit amet,amet consectetuer"
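Note that the stringr snippet in this answer joins the pairs with ", " (comma plus space), whereas the question asks for a bare comma. If adding a package is not desired, the same extract-then-pair idea can be sketched in base R with `regmatches()` and `gregexpr()`; the variable name `words` is chosen here for illustration only.

```r
text <- "Lorem,ipsum,dolor,sit,amet,consectetuer"

# Base R counterpart of extracting every run of non-comma characters
words <- regmatches(text, gregexpr("[^,]+", text))[[1]]

# Pair each word with its successor; collapse with a bare comma
# to match the format requested in the question
paired <- paste(words[-length(words)], words[-1], collapse = ",")
paired
# [1] "Lorem ipsum,ipsum dolor,dolor sit,sit amet,amet consectetuer"
```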

+2

akrun 22 Aug '14 at 9:21

You can use these functions from the stringi package:

require(stringi)
text <- "Lorem,ipsum,dolor,sit,amet,consectetuer"
words <- stri_split_fixed(text,",")[[1]]
stri_join(words[-length(words)]," ",words[-1],collapse = ", ")
## [1] "Lorem ipsum, ipsum dolor, dolor sit, sit amet, amet consectetuer"

Some benchmarks :)

stringi <- function(){
  words <- stri_split_fixed(text,",")[[1]]
  stri_join(words[-length(words)]," ",words[-1],collapse = ", ")
}

gsubAvinash <- function(){
  f <- gsub(",([^,]*)", " \\1,\\1", text, perl=TRUE)
  result <- gsub(",[^,]*$", "", f, perl=TRUE)
  result
}

strsplitBeggineR <- function(){
  x <- strsplit(text, ",")[[1]]
  paste0(sapply(1:(length(x)-1), function(z) paste(x[c(z, z+1)], collapse = " ")), collapse = ",")
}

stringrAkrun <- function(){
  txt2 <- str_extract_all(text, "[^,]+")[[1]]
  paste(paste(txt2[-length(txt2)],txt2[-1],sep=" "), collapse=", ")
}

require(microbenchmark)
microbenchmark(stringi(), gsubAvinash(),strsplitBeggineR(),stringrAkrun())
Unit: microseconds
               expr     min       lq   median       uq     max neval
          stringi()   8.657  10.6090  16.5005  17.6730  41.058   100
      gsubAvinash()  14.506  17.1055  20.2105  22.2040  97.399   100
 strsplitBeggineR()  53.609  59.7755  64.9470  68.3105 121.767   100
     stringrAkrun() 148.036 157.4715 162.4885 168.2880 342.471   100

+2

bartektartanus 23 Aug '14 at 22:13

Avinash Raj · Accepted Answer · 2014-08-22T08:10:22+0000

Using the gsub function:

> text="Lorem,ipsum,dolor,sit,amet,consectetuer"
> f <- gsub(",([^,]*)", " \\1,\\1", text, perl=TRUE)
> result <- gsub(",[^,]*$", "", f, perl=TRUE)
> result
[1] "Lorem ipsum,ipsum dolor,dolor sit,sit amet,amet consectetuer"
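To see why the two passes work, it may help to look at the intermediate string. The sketch below reruns the same two regular expressions and keeps each stage; the names `step1` and `step2` are introduced here for illustration, and `sub()` is used for the second pass only because the `$` anchor allows a single match anyway.

```r
text <- "Lorem,ipsum,dolor,sit,amet,consectetuer"

# Pass 1: every ",word" becomes " word,word", so each word after the
# first appears twice: once closing a pair and once opening the next.
step1 <- gsub(",([^,]*)", " \\1,\\1", text, perl = TRUE)
step1
# [1] "Lorem ipsum,ipsum dolor,dolor sit,sit amet,amet consectetuer,consectetuer"

# Pass 2: the last word has no successor, so drop the dangling
# ",consectetuer" at the end of the string.
step2 <- sub(",[^,]*$", "", step1, perl = TRUE)
step2
# [1] "Lorem ipsum,ipsum dolor,dolor sit,sit amet,amet consectetuer"
```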