U_REGEX_INVALID_CAPTURE_GROUP_NAME error occurs when trying to escape regex metacharacters, only on Windows

I recently implemented a function that escapes characters which would otherwise be interpreted as regex metacharacters before they go into a system call made by my R package 'rNOMADS':

SanitizeWGrib2Inputs <- function(check.strs) {
    #Escape regex characters before inputting to wgrib2
    #INPUTS
    #    CHECK.STRS - Strings possibly containing regex metacharacters
    #OUTPUTS
    #    CHECKED.STRS - Strings with metacharacters appropriately escaped

    meta.chars <- paste0("\\", c("(", ")", ".", "+", "*", "^", "$", "?", "[", "]", "|"))

    for(k in 1:length(meta.chars)) {
        check.strs <- stringr::str_replace(check.strs, meta.chars[k], paste0("\\\\", meta.chars[k]))
    }

    checked.strs <- check.strs

    return(checked.strs)
}

      

and I am including an example in my package documentation:

check.strs <- c("frank", "je.rry", "har\\old", "Johnny Ca$h")
checked.strs <- SanitizeWGrib2Inputs(check.strs) 

      

This works fine on my Ubuntu machine and passes CRAN checks there. However, when I uploaded the package to CRAN, their Windows checker said:

> check.strs <- c("frank", "je.rry", "har\\old", "Johnny Ca$h")
> checked.strs <- SanitizeWGrib2Inputs(check.strs)
Error in stri_replace_first_regex(string, pattern, fix_replacement(replacement),  :
  Invalid capture group name. (U_REGEX_INVALID_CAPTURE_GROUP_NAME)
Calls: SanitizeWGrib2Inputs -> <Anonymous> -> stri_replace_first_regex -> .Call
Execution halted
** running examples for arch 'x64' ... ERROR
Running examples in 'rNOMADS-Ex.R' failed
The error most likely occurred in:

> base::assign(".ptime", proc.time(), pos = "CheckExEnv")
> ### Name: SanitizeWGrib2Inputs
> ### Title: Make sure regex metacharacters are properly escaped
> ### Aliases: SanitizeWGrib2Inputs
> ### Keywords: internal
> 
> ### ** Examples
> 
> 
> check.strs <- c("frank", "je.rry", "har\\old", "Johnny Ca$h")
> checked.strs <- SanitizeWGrib2Inputs(check.strs)
Error in stri_replace_first_regex(string, pattern, fix_replacement(replacement),  :
  Invalid capture group name. (U_REGEX_INVALID_CAPTURE_GROUP_NAME)
Calls: SanitizeWGrib2Inputs -> <Anonymous> -> stri_replace_first_regex -> .Call
Execution halted

      

I have reproduced this behavior on my Windows partition. What is the workaround for this?

1 answer


Use a single str_replace_all call to escape all the special regex metacharacters:

SanitizeWGrib2Inputs <- function(check.strs) {
    # Prefix every regex metacharacter with a backslash ($0 is the whole match)
    return(stringr::str_replace_all(check.strs, "[{\\[()|?$^*+.\\\\]", "\\$0"))
}
check.strs <- c("frank", "je.rry", "har\\old", "Johnny Ca$h")
checked.strs <- SanitizeWGrib2Inputs(check.strs) 
checked.strs
## => [1] "frank"         "je\\.rry"      "har\\\\old"    "Johnny Ca\\$h"

      



Notes

  • "[{\\[()|?$^*+.\\\\]" (as the regex engine sees it, [{\[()|?$^*+.\\]) matches any single character that is {, [, (, ), |, ?, $, ^, *, +, . or \
  • The replacement "\\$0" changes each of those characters to \ followed by the same character ($0 is a backreference to the entire match).
  • I don't think you need to add ] here, since outside a character class it is not a special character unless an opening [ precedes it.
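
As a cross-check on the notes above, here is a minimal sketch of the same escaping done in base R with gsub and perl = TRUE instead of stringr. The helper name escape_regex is hypothetical and not part of rNOMADS or stringr, and this is only one possible way to write it. It escapes the example strings and then confirms that each escaped string, used as a pattern, matches its original text literally.

```r
# Hypothetical helper (not from rNOMADS or stringr): base R only, PCRE via perl = TRUE.
escape_regex <- function(x) {
  # Capture any single metacharacter and put a backslash in front of it.
  gsub("([{\\[()|?$^*+.\\\\])", "\\\\\\1", x, perl = TRUE)
}

check.strs <- c("frank", "je.rry", "har\\old", "Johnny Ca$h")
escaped <- escape_regex(check.strs)

# Each escaped string, anchored and used as a pattern, should match its original.
literal.match <- mapply(
  function(p, s) grepl(paste0("^", p, "$"), s, perl = TRUE),
  escaped, check.strs
)
stopifnot(all(literal.match))

# An unescaped "." would match any character; the escaped one should not.
stopifnot(!grepl(escape_regex("je.rry"), "jeXrry", perl = TRUE))
```

The round-trip check is the useful part: if a metacharacter were missed, either the anchored match would fail or the pattern would match text it should not.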