
sapply() with strsplit in R

I found this code:

string = c("G1:E001", "G2:E002", "G3:E003")
> sapply(strsplit(string, ":"), "[", 2)
[1] "E001" "E002" "E003"

      

Clearly, strsplit(string, ":") returns a list of length 3, where each component i is a vector of length 2 containing Gi and E00i.

But why do the two extra arguments "[", 2 select only E00i? As far as I can see, the only arguments the function takes are:

sapply(X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE) 

      

+3




3 answers


Look at the docs for ?sapply:

 sapply(X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE)

 FUN: the function to be applied to each element of ‘X’: see
      ‘Details’.  In the case of functions like ‘+’, ‘%*%’, the
      function name must be backquoted or quoted.

 ...: optional arguments to ‘FUN’.

      

Your answer lies there. In your case, FUN is `[`. The "optional arguments to FUN" are just 2 in your case, since that is what matches ... in your call. So here sapply calls `[` with each element of the list as the first argument and 2 as the second. Consider:

x <- c("G1", "E001")   # this is the result of `strsplit` on the first value

      



Then:

`[`(x, 2)      # equivalent to x[2]
# [1] "E001"

      

This is what sapply does in your example, except that it applies `[` to each length-2 character vector in the list returned by strsplit.
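To make the forwarding of ... concrete, here is a minimal sketch of roughly what the sapply call amounts to, written as an explicit loop. The variable names parts and second are only for illustration, and the loop ignores the simplification and naming logic that sapply also performs.

```r
string <- c("G1:E001", "G2:E002", "G3:E003")
parts <- strsplit(string, ":")   # list of three length-2 character vectors

# Hand-written counterpart of sapply(parts, "[", 2):
# call `[` once per list element, passing 2 as the extra argument.
second <- character(length(parts))
for (i in seq_along(parts)) {
  second[i] <- `[`(parts[[i]], 2)   # same as parts[[i]][2]
}

second
# [1] "E001" "E002" "E003"
```

The loop and the sapply call give the same character vector for this input.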

+2




You can use sub to get the expected result instead of using strsplit/sapply:

 sub('.*:', '', string)
 #[1] "E001" "E002" "E003"
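
One caveat worth checking: the pattern '.*:' is greedy, so sub and the strsplit approach only agree when each string contains exactly one colon. The sketch below uses a made-up input with two colons (not from the question) to show how the results can differ.

```r
x <- "G1:E001:X"   # hypothetical input with two separators

greedy <- sub(".*:", "", x)                  # removes everything up to the LAST colon
field2 <- sapply(strsplit(x, ":"), `[`, 2)   # picks the second field only
rest   <- sub("^[^:]*:", "", x)              # removes everything up to the FIRST colon

greedy
# [1] "X"
field2
# [1] "E001"
rest
# [1] "E001:X"
```

For the strings in the question all three give the same answer, so the choice only matters if the data may contain extra separators.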

      

As for your code, the output of strsplit is a list, and a list can be processed with the apply family of functions (sapply/lapply/vapply/rapply, etc.). In this case, each element of the list has length 2 and we select the second element.

strsplit(string, ":")
#[[1]]
#[1] "G1"   "E001"

#[[2]]
#[1] "G2"   "E002"

#[[3]]
#[1] "G3"   "E003"

lapply(strsplit(string, ":"), `[`, 2)
#[[1]]
#[1] "E001"

#[[2]]
#[1] "E002"

#[[3]]
#[1] "E003"

      



By default, sapply uses simplify=TRUE, which is why it returns a vector. With simplify=FALSE it returns a list, like lapply:

 sapply(strsplit(string, ":"), `[`, 2, simplify=FALSE)
#[[1]]
#[1] "E001"

#[[2]]
#[1] "E002"

#[[3]]
#[1] "E003"
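
As a small sketch of the difference between the variants mentioned above, the following compares the types returned by sapply with and without simplification, and by vapply, which requires the expected result type up front. The variable names are illustrative only.

```r
string <- c("G1:E001", "G2:E002", "G3:E003")
parts <- strsplit(string, ":")

simplified <- sapply(parts, `[`, 2)                    # character vector
as_list    <- sapply(parts, `[`, 2, simplify = FALSE)  # list, as lapply() would return
strict     <- vapply(parts, `[`, FUN.VALUE = character(1), 2)

is.character(simplified)   # TRUE
is.list(as_list)           # TRUE
identical(strict, simplified)   # TRUE
```

vapply stops with an error if any call returns something other than a single string, which can be preferable in scripts where a silent change of return type would be a problem.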

      

`[` can be replaced with an anonymous function:

sapply(strsplit(string, ":"), function(x) x[2], simplify=FALSE)
#[[1]]
#[1] "E001"

#[[2]]
#[1] "E002"

#[[3]]
#[1] "E003"
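
One practical reason to prefer the anonymous-function form is that it leaves room for extra logic. The sketch below assumes an input where one string has no ":" at all; the entry "G4" and the helper second_or_empty are hypothetical, introduced only for illustration.

```r
string <- c("G1:E001", "G4")   # "G4" is a made-up entry with no separator
parts <- strsplit(string, ":")

# Plain indexing: "G4" splits into a single piece, so x[2] is NA for it.
plain <- sapply(parts, function(x) x[2])
plain
# [1] "E001" NA

# Hypothetical fallback: return "" when there is no second piece.
second_or_empty <- function(x) if (length(x) >= 2) x[2] else ""
safe <- sapply(parts, second_or_empty)
safe
# [1] "E001" ""
```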

      

+5




Because the output of strsplit() is a list. "[" is the function that extracts elements, and 2 indicates that the second element of each list member is selected. The function sapply() ensures that this is done for each member of the list. Here "[" is the function that sapply() applies to each element of the list returned by strsplit(), and it is called with the additional argument 2.

> strsplit(string, ":")
#[[1]]
#[1] "G1"   "E001"
#
#[[2]]
#[1] "G2"   "E002"
#
#[[3]]
#[1] "G3"   "E003"
#
> str(strsplit(string, ":"))
#List of 3
# $ : chr [1:2] "G1" "E001"
# $ : chr [1:2] "G2" "E002"
# $ : chr [1:2] "G3" "E003"
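
The quoted name "[" works because sapply looks up its FUN argument with match.fun. A brief sketch of that lookup, using an illustrative vector x:

```r
x <- c("G1", "E001")

is.function(`[`)       # TRUE: the subsetting operator is a function like any other
f <- match.fun("[")    # look the function up by its quoted name
second <- f(x, 2)      # same as x[2]
second
# [1] "E001"
```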

      

+2

