Replace a leading @ on a word with ! in R
I am trying to do a string replacement in R. I need to find every word preceded by "@" with no space in between (like @word) and change the "@" to "!" (for example, @word becomes !word). Other occurrences of "@" should stay unchanged (for example, @, @@, or @[@]). This is my original data frame (the items to modify are @def, @jkl, and @stu):
> df = data.frame(number = 1:4, text = c('abc @def ghi', '@jkl @ mno', '@[@] pqr @stu', 'vwx @@@ yz'))
> df
number text
1 1 abc @def ghi
2 2 @jkl @ mno
3 3 @[@] pqr @stu
4 4 vwx @@@ yz
And this is what I need the result to look like:
> df_result = data.frame(number = 1:4, text = c('abc !def ghi', '!jkl @ mno', '@[@] pqr !stu', 'vwx @@@ yz'))
> df_result
number text
1 1 abc !def ghi
2 2 !jkl @ mno
3 3 @[@] pqr !stu
4 4 vwx @@@ yz
I tried with
> gsub('@.+[a-z] ', '!', df$text)
[1] "abc !ghi" "!@ mno" "!@stu" "vwx @@@ yz"
But this is not the result I want. Any help is greatly appreciated.
Thanks.
What about
gsub("(^| )@(\\w)", "\\1!\\2", df$text)
# [1] "abc !def ghi" "!jkl @ mno" "@[@] pqr !stu" "vwx @@@ yz"
This matches an @ character at the beginning of a line or after a space. The word character that follows the @ is then captured, and the @ is replaced with !.
Explanation courtesy of regex101.com:
- (^| ) - the first capturing group; ^ asserts the position at the start of the line; | stands for "or"; the space matches a space character literally.
- @ matches the character @ literally (case sensitive).
- (\\w) - the second capturing group; \\w matches a word character.
The replacement string \\1!\\2 replaces the regex match with the first capturing group (\\1), then !, followed by the second capturing group (\\2).
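
A minimal sketch of writing the result back into the data frame from the question, using the same pattern as above. The only additions are the assignment to df$text and stringsAsFactors = FALSE, which is an assumption made here so that the column stays a character vector.

```r
# Data frame from the question; stringsAsFactors = FALSE keeps text as character
df <- data.frame(number = 1:4,
                 text = c('abc @def ghi', '@jkl @ mno', '@[@] pqr @stu', 'vwx @@@ yz'),
                 stringsAsFactors = FALSE)

# \\1 puts back the start-of-line/space that was matched,
# \\2 puts back the word character that followed the @
df$text <- gsub("(^| )@(\\w)", "\\1!\\2", df$text)

df$text
# "abc !def ghi", "!jkl @ mno", "@[@] pqr !stu", "vwx @@@ yz"
```

Note that \\w also matches digits and the underscore, so something like @1 would become !1 as well. If only letters should count, [A-Za-z] could be used in place of \\w.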
You can use a positive lookahead (?=...):
gsub("@(?=[A-Za-z])", "!", df$text, perl = TRUE)
[1] "abc !def ghi" "!jkl @ mno" "@[@] pqr !stu" "vwx @@@ yz"
From the documentation page "Regular Expressions as used in R":
Patterns (?=...) and (?!...) are zero-width positive and negative lookahead assertions: they match if an attempt to match the ... forward from the current position would succeed (or not), but use up no characters in the string being processed.
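
One thing to be aware of: the lookahead only checks what comes after the @, not what comes before it. The sketch below uses a hypothetical input (the vector x, which is not from the question) to show the difference. It adds a negative lookbehind, (?<!\\S), as one possible way to also require that the @ starts a word. Both patterns need perl = TRUE.

```r
# Hypothetical input: the second element has an @ in the middle of a token
x <- c("abc @def ghi", "user@example.com")

# Lookahead only: any @ followed by a letter is replaced
only_ahead <- gsub("@(?=[A-Za-z])", "!", x, perl = TRUE)
only_ahead
# "abc !def ghi", "user!example.com"

# Lookahead plus negative lookbehind: the @ must not be preceded by a
# non-whitespace character, i.e. it must follow whitespace or start the string
word_start <- gsub("(?<!\\S)@(?=[A-Za-z])", "!", x, perl = TRUE)
word_start
# "abc !def ghi", "user@example.com"
```

On the data from the question both patterns give the same result, since every @ followed by a letter there is at the start of the string or after a space.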