Remove extra spaces between letters in R using gsub()

There are many answers on how to remove extra spaces between words, which is very simple. However, I find that removing the extra spaces inside words is much more difficult. As a reproducible example, let's say I have a character vector that looks like this:

x <- c("L L C", "P O BOX 123456", "NEW YORK")

What I would like to do is something like this:

y <- gsub("(\\w)(\\s)(\\w)(\\s)", "\\1\\3", x)

But that leaves me with this:

[1] "LLC" "POBOX 123456" "NEW YORK"

Almost perfect, but I'd really like the second element to be "PO BOX 123456". Is there a better way to do this than what I am doing?



1 answer


You can try this:

> x <- c("L L C", "P O BOX 123456", "NEW YORK")
> gsub("(?<=\\b\\w)\\s(?=\\w\\b)", "", x, perl = TRUE)
[1] "LLC"           "PO BOX 123456" "NEW YORK"

      



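It removes a whitespace character only when it sits between two single-letter words. The lookbehind (?<=\\b\\w) requires a one-character word immediately before the space, and the lookahead (?=\\w\\b) requires a one-character word immediately after it. Because lookarounds do not consume characters, the same letter can serve as the right-hand neighbour of one space and the left-hand neighbour of the next, which is why "L L C" collapses fully to "LLC" while the space before "BOX" is left alone.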
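As a rough sketch, the pattern can be wrapped in a small helper and tried on a few more inputs. The function name collapse_single_letters and the extra test strings are made up for illustration; only the regular expression comes from the answer.

```r
# Hypothetical helper (the name is an assumption, not from the original answer).
collapse_single_letters <- function(x) {
  # Drop a whitespace character only when a one-character word
  # sits directly on each side of it; perl = TRUE enables lookarounds.
  gsub("(?<=\\b\\w)\\s(?=\\w\\b)", "", x, perl = TRUE)
}

# The first three strings are from the question; the last two are added examples.
x <- c("L L C", "P O BOX 123456", "NEW YORK", "A B C INC", "P O BOX 1 2")
result <- collapse_single_letters(x)
print(result)
# [1] "LLC"           "PO BOX 123456" "NEW YORK"      "ABC INC"       "PO BOX 12"
```

Note that \\w also matches digits and the underscore, so isolated single digits are joined as well (the last element above). If only letters should be collapsed, replacing \\w with [A-Za-z] in both lookarounds is one possible adjustment.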
