How to conditionally add "." at the end of a line in R
Data <- c("My name is Ernst.","I love chicken","Hello, my name is Stan!","Who?","I Love you!","Winner")
The function should add "." to end the sentence if none of the characters [.?!] is already at the end of the sentence.
I tried to create a function in R using regex, but I had trouble matching only at the end of the line.
The gsub call below adds a period at the end of a sentence only if the sentence does not already end with ., ? or !.
> Data <- c("My name is Ernst.","I love chicken","Hello, my name is Stan!","Who?","I Love you!","Winner")
> gsub("^(?!.*[.?!]$)(.*)$", "\\1.", Data, perl=TRUE)
[1] "My name is Ernst." "I love chicken."
[3] "Hello, my name is Stan!" "Who?"
[5] "I Love you!" "Winner."
In regular expressions, lookarounds are used to check a condition without consuming characters. The negative lookahead (?!.*[.?!]$) checks for ., ? or ! at the end of the string. If one of them is the last character, the lookahead fails, so the pattern does not match and no replacement occurs for that string. The replacement occurs only if there is no ., ? or ! at the end.
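Since the question asks for a function, the lookahead approach could be wrapped up roughly as follows. This is only a sketch: the name add_period is made up for illustration and is not part of the original answer.

```r
# Hypothetical helper; the name add_period is an assumption, not from the answer.
add_period <- function(x) {
  # The negative lookahead rejects strings that already end in ., ? or !
  # so only the remaining strings are rewritten as "<string>."
  gsub("^(?!.*[.?!]$)(.*)$", "\\1.", x, perl = TRUE)
}

Data <- c("My name is Ernst.", "I love chicken", "Hello, my name is Stan!",
          "Who?", "I Love you!", "Winner")
add_period(Data)
```

Calling add_period(Data) should reproduce the output shown above.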
OR
Using a negative lookbehind and a positive lookahead:
> Data <- c("My name is Ernst.","I love chicken","Hello, my name is Stan!","Who?","I Love you!","Winner")
> sub("(?<![!?.])(?=$)", ".", Data, perl=TRUE)
[1] "My name is Ernst." "I love chicken."
[3] "Hello, my name is Stan!" "Who?"
[5] "I Love you!" "Winner."
Here are some possible approaches:
1) If the last character is not ., ? or !, replace it with that same character followed by a period:
sub("([^.!?])$", "\\1.", Data)
For the data in the question, we get:
[1] "My name is Ernst." "I love chicken."
[3] "Hello, my name is Stan!" "Who?"
[5] "I Love you!" "Winner."
2) The gsubfn solution is even simpler. It replaces the empty capture group () with a period if the last character is not ., ! or ?.
library(gsubfn)
gsubfn("[^.!?]()$", ".", Data)
3) This one uses grepl. If ., ! or ? is the last character, it appends an empty string; otherwise it appends a period.
paste0(Data, ifelse(grepl("[.!?]$", Data), "", "."))
4) This one doesn't use regular expressions at all. It extracts the last character and, if it is ., ! or ?, appends an empty string; otherwise it appends a period:
paste0(Data, ifelse(substring(Data, nchar(Data)) %in% c(".", "!", "?"), "", "."))
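As a quick sanity check, the base R approaches 1), 3) and 4) can be compared on the data from the question. This is just a sketch; the names res1, res3 and res4 are introduced here for illustration, and 2) is left out so the snippet does not depend on the gsubfn package.

```r
Data <- c("My name is Ernst.", "I love chicken", "Hello, my name is Stan!",
          "Who?", "I Love you!", "Winner")

# Approach 1: capture a final character that is not ., ! or ? and append a period
res1 <- sub("([^.!?])$", "\\1.", Data)

# Approach 3: test the ending with grepl and paste "" or "." accordingly
res3 <- paste0(Data, ifelse(grepl("[.!?]$", Data), "", "."))

# Approach 4: inspect the last character directly, no regular expression
last_char <- substring(Data, nchar(Data))
res4 <- paste0(Data, ifelse(last_char %in% c(".", "!", "?"), "", "."))

# Check that the three approaches agree on this input
identical(res1, res3)
identical(res3, res4)
```

One difference worth knowing about: for an empty string "", approach 1 has no final character to match and leaves it unchanged, while 3) and 4) return ".".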
Here's another solution.
x <- c('My name is Ernst.', 'I love chicken',
'Hello, my name is Stan!', 'Who?', 'I Love you!', 'Winner')
r <- sub('[^?!.]\\K$', '.', x, perl=TRUE)
## [1] "My name is Ernst." "I love chicken."
## [3] "Hello, my name is Stan!" "Who?"
## [5] "I Love you!" "Winner."