How to conditionally add "." at the end of a string in R

Data <- c("My name is Ernst.","I love chicken","Hello, my name is Stan!","Who?","I Love    you!","Winner")

      

The function should append "." to a sentence only if the sentence does not already end with one of the characters [.?!].

I tried to write a function in R using a regex, but I had trouble restricting the match to the end of the string.

4 answers


The gsub call below adds a period at the end of a sentence only if the sentence does not already end with ., ? or !.

> Data <- c("My name is Ernst.","I love chicken","Hello, my name is Stan!","Who?","I Love    you!","Winner")
> gsub("^(?!.*[.?!]$)(.*)$", "\\1.", Data, perl=TRUE)
[1] "My name is Ernst."       "I love chicken."        
[3] "Hello, my name is Stan!" "Who?"                   
[5] "I Love    you!"          "Winner."

      

In regular expressions, lookarounds are used to test a condition without consuming any characters. The negative lookahead (?!.*[.?!]$) checks whether the string ends with ., ? or !. If it does, the match fails and no replacement is made for that string. The replacement only happens when the string does not end with ., ? or !.
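To see the lookahead on its own, here is a small sketch that uses only the assertion part of the pattern with grepl. The variable name needs_period is just an illustrative choice, not something from the answer.

```r
Data <- c("My name is Ernst.", "I love chicken", "Hello, my name is Stan!",
          "Who?", "I Love    you!", "Winner")

# TRUE where the string does NOT already end with ., ? or !
# (the negative lookahead succeeds only for those strings)
needs_period <- grepl("^(?!.*[.?!]$)", Data, perl = TRUE)
needs_period
# [1] FALSE  TRUE FALSE FALSE FALSE  TRUE
```

Only the elements flagged TRUE are rewritten by the gsub call; the others never match, so they are returned unchanged.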



Alternatively, using a negative lookbehind and a positive lookahead:

> Data <- c("My name is Ernst.","I love chicken","Hello, my name is Stan!","Who?","I Love    you!","Winner")
> sub("(?<![!?.])(?=$)", ".", Data, perl=TRUE)
[1] "My name is Ernst."       "I love chicken."        
[3] "Hello, my name is Stan!" "Who?"                   
[5] "I Love    you!"          "Winner." 
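The lookbehind variant above matches an empty position rather than any character, which is why nothing is removed from the string. A quick way to check this, as a sketch, is to inspect the match with regexpr (the variable names m, pos and len are illustrative):

```r
# Both assertions are zero-width, so a successful match has length 0.
m <- regexpr("(?<![!?.])(?=$)", c("Winner", "Who?"), perl = TRUE)

pos <- as.integer(m)             # start of the match, -1 if no match
len <- attr(m, "match.length")   # length of the match, -1 if no match

pos  # 7 -1 : "Winner" matches just past its 6th character, "Who?" does not match
len  # 0 -1 : the match is empty, so sub() only inserts the "."
```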

      



Using stringi:

library(stringi)
# Match the word boundary at the end of the string, unless the last character is !, ? or .
stri_replace_all_regex(Data, "(?<![!?.])\\b$", ".")
#[1] "My name is Ernst."       "I love chicken."        
#[3] "Hello, my name is Stan!" "Who?"                   
#[5] "I Love    you!"          "Winner." 

      



Here are some possible approaches:

1) If the last character is not ., ! or ?, replace it with that same character followed by a period:

sub("([^.!?])$", "\\1.", Data)

      

For the data in the question, we get:

[1] "My name is Ernst."       "I love chicken."        
[3] "Hello, my name is Stan!" "Who?"                   
[5] "I Love    you!"          "Winner."   

      

2) A gsubfn solution is even simpler. It replaces the empty () with a period if the last character is not ., ! or ?.

library(gsubfn)
gsubfn("[^.!?]()$", ".", Data)

      

3) This one uses grepl. If the last character is ., ! or ?, it appends an empty string; otherwise it appends a period.

paste0(Data, ifelse(grepl("[.!?]$", Data), "", "."))

      

4) This one does not use regular expressions at all. It extracts the last character and, if that character is ., ! or ?, appends an empty string; otherwise it appends a period:

paste0(Data, ifelse(substring(Data, nchar(Data)) %in% c(".", "!", "?"), "", "."))
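As a rough sketch, approach 3 can be wrapped in a small reusable function and compared against approaches 1 and 4. The name add_period is hypothetical and not part of any package.

```r
# Hypothetical helper: append "." unless the string already ends with ., ! or ?
add_period <- function(x) {
  paste0(x, ifelse(grepl("[.!?]$", x), "", "."))
}

Data <- c("My name is Ernst.", "I love chicken", "Hello, my name is Stan!",
          "Who?", "I Love    you!", "Winner")

a1 <- sub("([^.!?])$", "\\1.", Data)                       # approach 1
a3 <- add_period(Data)                                     # approach 3
a4 <- paste0(Data, ifelse(substring(Data, nchar(Data)) %in%
                            c(".", "!", "?"), "", "."))    # approach 4

same <- identical(a1, a3) && identical(a3, a4)
same
# [1] TRUE

# The approaches differ on an empty string: approach 1 needs a character
# to capture, so it leaves "" alone, while approach 3 appends a period.
empty1 <- sub("([^.!?])$", "\\1.", "")   # ""
empty3 <- add_period("")                 # "."
```

Which behaviour is preferable for empty strings depends on the data, so it is worth checking before choosing one of the variants.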

      



Here's another solution.

x <- c('My name is Ernst.', 'I love chicken', 
       'Hello, my name is Stan!', 'Who?', 'I Love    you!', 'Winner')
r <- sub('[^?!.]\\K$', '.', x, perl=TRUE)
r
## [1] "My name is Ernst."       "I love chicken."        
## [3] "Hello, my name is Stan!" "Who?"                   
## [5] "I Love    you!"          "Winner."   

      
