sapply() with strsplit in R
I found this code:
string = c("G1:E001", "G2:E002", "G3:E003")
> sapply(strsplit(string, ":"), "[", 2)
[1] "E001" "E002" "E003"
Clearly strsplit(string, ":") returns a vector of size 3, where each component i is a vector of size 2 containing Gi and E00i.
But why do the two extra arguments "[", 2 select only E00i? As far as I can see, the only arguments the function takes are:
sapply(X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE)
Look at the docs for ?sapply:
sapply(X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE)
FUN: the function to be applied to each element of ‘X’: see
‘Details’. In the case of functions like ‘+’, ‘%*%’, the
function name must be backquoted or quoted.
...: optional arguments to ‘FUN’.
Your answer lies there. In your case FUN is `[`. The "optional arguments to FUN" are just 2 in your case, since that is what gets matched to ... in your call. So here sapply calls `[` with each value in your list as the first argument and 2 as the second. Consider:
x <- c("G1", "E001") # this is the result of `strsplit` on the first value
Then:
`[`(x, 2) # equivalent to x[2]
# [1] "E001"
This is what sapply does in your example, except that it is applied to each of the length-2 character vectors in the list returned by strsplit.
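To make the argument matching concrete, here is a small sketch built on the question's data. The variable names (parts, f, by_name, by_hand) are only illustrative. It checks that the string "[" resolves to the subsetting function and that the trailing 2 reaches each call through the ... argument.

```r
string <- c("G1:E001", "G2:E002", "G3:E003")
parts <- strsplit(string, ":")

# sapply() resolves a function given by name with match.fun()
f <- match.fun("[")
f(parts[[1]], 2)            # same as parts[[1]][2]

# The 2 is not an argument of sapply(); it is forwarded to `[` via ...
by_name <- sapply(parts, "[", 2)

# Spelling out the three calls by hand gives the same vector
by_hand <- c(parts[[1]][2], parts[[2]][2], parts[[3]][2])
identical(by_name, by_hand)
```

Passing "[" as a string or as the backquoted name `[` should behave the same way here, since both end up referring to the same function.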
You can use sub to get the expected result instead of using strsplit/sapply:
sub('.*:', '', string)
#[1] "E001" "E002" "E003"
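One caveat worth checking before swapping approaches: the pattern '.*:' is greedy, so the two methods only agree when each string contains exactly one colon. The sketch below uses a hypothetical input with two colons (not part of the question) to show the difference.

```r
multi <- "G1:E001:X"   # hypothetical input with two colons

# Greedy .* consumes everything up to the LAST colon
greedy <- sub(".*:", "", multi)

# Anchoring on non-colon characters strips only up to the FIRST colon
first <- sub("^[^:]*:", "", multi)

# strsplit + `[` picks exactly the second field
second <- sapply(strsplit(multi, ":"), `[`, 2)

greedy   # "X"
first    # "E001:X"
second   # "E001"
```

For the strings in the question all three give the same answer, so which one to use depends on what the real data can look like.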
As for your code, the output of strsplit is a list, and a list can be processed with the family of functions sapply/lapply/vapply/rapply, etc. In this case, each element of the list has length 2 and we select the second item.
strsplit(string, ":")
#[[1]]
#[1] "G1" "E001"
#[[2]]
#[1] "G2" "E002"
#[[3]]
#[1] "G3" "E003"
lapply(strsplit(string, ":"), `[`, 2)
#[[1]]
#[1] "E001"
#[[2]]
#[1] "E002"
#[[3]]
#[1] "E003"
The default for sapply is simplify=TRUE. With simplify=FALSE it returns a list, like lapply:
sapply(strsplit(string, ":"), `[`, 2, simplify=FALSE)
#[[1]]
#[1] "E001"
#[[2]]
#[1] "E002"
#[[3]]
#[1] "E003"
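As a quick check of what simplify changes, the following sketch compares the return types on the question's data. The variable names are illustrative only.

```r
string <- c("G1:E001", "G2:E002", "G3:E003")
parts <- strsplit(string, ":")

simplified <- sapply(parts, `[`, 2)                    # simplify = TRUE (default)
as_list    <- sapply(parts, `[`, 2, simplify = FALSE)  # no simplification

is.character(simplified)   # a plain character vector
is.list(as_list)           # still a list, one element per input

# With simplify = FALSE the result matches lapply() for this input
identical(as_list, lapply(parts, `[`, 2))
```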
`[` can be replaced with an anonymous function call:
sapply(strsplit(string, ":"), function(x) x[2], simplify=FALSE)
#[[1]]
#[1] "E001"
#[[2]]
#[1] "E002"
#[[3]]
#[1] "E003"
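Since vapply was mentioned above, here is a sketch of the same extraction with an explicit result type. It assumes every piece should yield a single string. The second input, with a value that has no colon, is hypothetical and only there to show what happens when the second field is missing.

```r
string <- c("G1:E001", "G2:E002", "G3:E003")

# character(1) declares that each call must return one string;
# the trailing 2 is again forwarded to `[` through ...
codes <- vapply(strsplit(string, ":"), `[`, character(1), 2)
codes

# Hypothetical input: "G4" has no ":" so its second field does not exist
with_gap <- vapply(strsplit(c("G1:E001", "G4"), ":"), `[`, character(1), 2)
with_gap   # the missing field comes back as NA
```

Indexing past the end of a vector with `[` yields NA rather than an error, so a missing field shows up as NA in the result.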
Because the output of strsplit() is a list. "[" subsets each element of that list, and 2 indicates that the second item of the element is selected. The function sapply() ensures that this is done for each member of the list. Here `[` is the function that sapply() applies to the list returned by strsplit(), and it is called with the additional argument 2.
> strsplit(string, ":")
#[[1]]
#[1] "G1" "E001"
#
#[[2]]
#[1] "G2" "E002"
#
#[[3]]
#[1] "G3" "E003"
#
> str(strsplit(string, ":"))
#List of 3
# $ : chr [1:2] "G1" "E001"
# $ : chr [1:2] "G2" "E002"
# $ : chr [1:2] "G3" "E003"