Is it possible to invert the pattern argument in ls()?

I'm trying to get a vector of all the function names in the base package that contain either only `.` punctuation marks or no punctuation marks at all. I would like to do this using just the function `ls()`.

`ls()` takes an argument `pattern`, which is documented as

an optional regular expression. Only names matching `pattern` are returned. `glob2rx` can be used to convert wildcard patterns to regular expressions.

I am trying to invert my regex, but I also want to keep the functions containing `.`. Here's an example of some of the ones I don't need.

lsBase1 <- ls("package:base", pattern = "[[:punct:]]")
head(lsBase1)
# [1] "^"   "~"   "<"   "<<-" "<="  "<-" 

      

I want an inverted version of this, as if I were using `invert = TRUE` in `grep`, or as in the following. But I also want to keep the functions whose only punctuation marks are `.`.

lsBase2 <- ls("package:base")
lsBase2 <- lsBase2[!grepl("[[:punct:]]", lsBase2)]
head(lsBase2)
# [1] "abbreviate"      "abs"             "acos"            "acosh"          
# [5] "addNA"           "addTaskCallback"

      

Is there a way to invert the `pattern` argument to `ls()`? Or, more generally, can I invert the regex `[[:punct:]]` so that it returns the opposite, but still includes those names whose only punctuation marks are `.`?

Note: more than one `.` is OK.

Another example of what I want: yes, I want `is.vector`, but I don't want `[.data.frame`.

+3




3 answers


I believe this is what you are looking for:

m <- ls("package:base", pattern="^(\\.|[^[:punct:]])*$")

      

`|` is the regular-expression operator for "OR", so in words the pattern says something like "match a sequence of characters from the beginning of a string to the end of the string, each of which is either `.` OR not a punctuation character."
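As a minimal sketch of how that pattern behaves, it can be tried with `grepl()` on a small hand-written vector instead of the whole base package. The names in `nms` are a hypothetical sample chosen for illustration; only the pattern itself comes from the answer.

```r
# Pattern from the answer: every character is either "." or a
# non-punctuation character, from the start to the end of the string.
pat <- "^(\\.|[^[:punct:]])*$"

# Hypothetical sample of names (illustration only).
nms <- c("is.vector", "[.data.frame", "abs", "<-",
         "package_version", "as.data.frame")

kept <- nms[grepl(pat, nms)]   # names that satisfy the pattern
dropped <- setdiff(nms, kept)  # names rejected by the pattern

print(kept)
print(dropped)
```

Note that `package_version` is rejected because `_` counts as punctuation in `[[:punct:]]`.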




To confirm that this works:

## Dissolve the matched strings and check for any verboten characters.  
sort(unique(unlist(strsplit(m, ""))))
#  [1] "." "0" "1" "2" "3" "4" "8" "a" "A" "b" "B" "c" "C" "d" "D" "e"
# [17] "E" "f" "F" "g" "G" "h" "H" "i" "I" "j" "J" "k" "K" "l" "L" "m"
# [33] "M" "n" "N" "o" "O" "p" "P" "q" "Q" "r" "R" "s" "S" "t" "T" "u"
# [49] "U" "v" "V" "w" "W" "x" "X" "y" "Y" "z"

## Have a look at (at least a few of) the names _excluded_ by the regex:
n <- setdiff(ls("package:base"), m)
sample(n, 10)
# [1] "names<-.POSIXlt" "[[<-.data.frame" "!.hexmode"       "$<-"            
# [5] "<-"              "&&"              "%*%"             "package_version"
# [9] "$"               "regmatches<-"   

      

+5




The following will work for what you are asking, assuming `lsBase2` holds the full, unfiltered result of `ls("package:base")`:

> lsBase2[grepl('^([^\\pP\\pS]|\\.)+$', lsBase2, perl=TRUE)]

      



Edit: Or you can simply use the following, which returns 1029 results in R version 3.1.1:

> ls("package:base", pattern="^[a-zA-Z0-9.]+$")
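A rough way to compare the two patterns in this answer is to run both on a small sample. The vector `nms` below is hypothetical and purely ASCII. The two patterns are not equivalent in general: the first one would also keep names containing non-ASCII letters, while the explicit character class would not.

```r
# Hypothetical ASCII-only sample of names (illustration only).
nms <- c("is.vector", "[.data.frame", "abs", "<-",
         "package_version", "Sys.time")

# Unicode-property version: reject punctuation (\pP) and symbols (\pS),
# but allow "." explicitly. Requires perl = TRUE.
by_property <- nms[grepl("^([^\\pP\\pS]|\\.)+$", nms, perl = TRUE)]

# Explicit character-class version: ASCII letters, digits and "." only.
by_class <- nms[grepl("^[a-zA-Z0-9.]+$", nms)]

print(by_property)
print(identical(by_property, by_class))
```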

      

+3




It's easy when you think about it step by step. First remove the `.` characters, then scan for any remaining punctuation:

lsBase2[!grepl('[[:punct:]]', gsub('[.]', '', lsBase2))]
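The two steps can be spelled out with intermediate variables to make each stage visible. This is only a sketch on a hypothetical sample vector `nms`; the one-liner above would be applied to the full, unfiltered output of `ls("package:base")`.

```r
# Hypothetical sample of names (illustration only).
nms <- c("is.vector", "[.data.frame", "abs", "names<-", "as.data.frame")

# Step 1: strip every "." so that it no longer counts as punctuation.
no_dots <- gsub("[.]", "", nms)

# Step 2: flag names that still contain some other punctuation character.
has_other_punct <- grepl("[[:punct:]]", no_dots)

# Keep only the names with no punctuation left after removing the dots.
result <- nms[!has_other_punct]

print(no_dots)
print(result)
```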

      

0

