How can I get the column names from a sorted column portion of a matrix?
I have the following in R:
> require("pls")
> set.seed(42)
> a = matrix(rnorm(12), ncol=4)
> a
           [,1]       [,2]        [,3]       [,4]
[1,]  1.3709584  0.6328626  1.51152200 -0.0627141
[2,] -0.5646982  0.4042683 -0.09465904  1.3048697
[3,]  0.3631284 -0.1061245  2.01842371  2.2866454
> b <- c(1, 2, 3)
> mymodel <- pcr(b ~ a)
> mymodel$loadings
Loadings:
   Comp 1 Comp 2
a1  0.654  0.165
a2  0.136 -0.255
a3  0.415  0.732
a4 -0.618  0.609
> comp1 <- mymodel$loadings[, 1]
> comp1
a1 a2 a3 a4
0.6539036 0.1362937 0.4146222 -0.6179988
> sort(comp1, decreasing=TRUE)
a1 a3 a2 a4
0.6539036 0.4146222 0.1362937 -0.6179988
> sort(comp1, decreasing=TRUE)[1]
a1
0.6539036
I'm really puzzled as to what comp1 is:
> colnames(comp1)
NULL
> rownames(comp1)
NULL
> dim(comp1)
NULL
> str(comp1)
Named num [1:4] 0.654 0.136 0.415 -0.618
- attr(*, "names")= chr [1:4] "a1" "a2" "a3" "a4"
> typeof(comp1)
[1] "double"
Questions:
- What data structure is comp1?
- What should I do to get the column names from the sorted comp1, i.e. "a1", "a3", "a2", "a4"?
Good question in many ways, as these small but very important differences trip up many new and even intermediate R users. In this case, your call to pcr() returns a named list, mymodel. Your question shows that you are using str() to inspect the returned objects, and that is always the best place to start. You can also refer to the help page ?pcr, which explains that the return value is a list of named components plus "all components returned by the underlying fitting function." For pcr(), the underlying fit function is svdpc.fit, and the reference page for that function describes the remaining items in the list mymodel.
As you can see below, mymodel$loadings is a numeric vector with two named dimensions (via a dimnames list), also known as a matrix. Using the [ operator to slice off the first column (selected by name, "Comp 1", in my code, since that is clearer and less likely to break if the order of the columns ever changes) returns a simple named numeric vector, because you have selected only one column of the matrix. You can tell that it is not a matrix because it has no dimensions, only a length, which is also why colnames(), rownames() and dim() all return NULL for it.
> str(mymodel$loadings)
loadings [1:4, 1:2] 0.654 0.136 0.415 -0.618 0.165 ...
- attr(*, "dimnames")=List of 2
..$ : chr [1:4] "a1" "a2" "a3" "a4"
..$ : chr [1:2] "Comp 1" "Comp 2"
> is.matrix(mymodel$loadings)
[1] TRUE
> str(mymodel$loadings[, "Comp 1"])
Named num [1:4] 0.654 0.136 0.415 -0.618
- attr(*, "names")= chr [1:4] "a1" "a2" "a3" "a4"
> dim(mymodel$loadings[, "Comp 1"])
NULL
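Since comp1 is a named vector rather than a matrix, its labels live in the names attribute, so names() is the accessor to use. The following is a minimal sketch of that idea; to keep it runnable without the pls package, comp1 is rebuilt by hand from the values printed in the question, which is an assumption made for illustration only.

```r
# Values copied from the question's printed comp1; in the real session this
# would come from comp1 <- mymodel$loadings[, "Comp 1"].
comp1 <- c(a1 = 0.6539036, a2 = 0.1362937, a3 = 0.4146222, a4 = -0.6179988)

# A named vector carries names(), not colnames() or rownames().
# sort() keeps each name attached to its value, so names() of the
# sorted vector gives the labels in sorted order.
sorted_names <- names(sort(comp1, decreasing = TRUE))
print(sorted_names)

# Equivalent route that never reorders the values themselves:
# index the names by the permutation returned by order().
sorted_names_alt <- names(comp1)[order(comp1, decreasing = TRUE)]
print(sorted_names_alt)
```

Both forms should print "a1" "a3" "a2" "a4", matching the order shown by sort(comp1, decreasing=TRUE) in the question.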
R's habit of dropping dimensions when you extract a single column is not always the behavior you want, especially if you are writing generic code that should always return the same type of object. One way to make colnames() work for you is to pass the argument drop = FALSE to [ (see ?"["). This keeps the result a matrix, with its dimensions and dimnames intact:
> altslice <- mymodel$loadings[, "Comp 1", drop = FALSE]
> colnames(altslice)
[1] "Comp 1"
> dim(altslice)
[1] 4 1
> is.matrix(altslice)
[1] TRUE
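Note that in the one-column matrix the variable labels "a1" to "a4" are row names, while the column name is "Comp 1". If you stay with the matrix form, one possible way to get the labels in sorted order is to index rownames() by order(). This is only a sketch: the loadings matrix below is typed in by hand from the rounded values printed above, so that the example runs without pls.

```r
# Hand-built stand-in for mymodel$loadings, using the rounded values
# printed in the question (an assumption made for illustration only).
loadings <- matrix(
  c(0.654, 0.136, 0.415, -0.618,
    0.165, -0.255, 0.732, 0.609),
  ncol = 2,
  dimnames = list(c("a1", "a2", "a3", "a4"), c("Comp 1", "Comp 2"))
)

# drop = FALSE keeps a 4 x 1 matrix instead of a named vector.
altslice <- loadings[, "Comp 1", drop = FALSE]

# The variable labels are the row names; reorder them by the column's values.
ranked_comp1 <- rownames(altslice)[order(altslice[, 1], decreasing = TRUE)]
print(ranked_comp1)

# The same pattern works on the full matrix for any component.
ranked_comp2 <- rownames(loadings)[order(loadings[, "Comp 2"], decreasing = TRUE)]
print(ranked_comp2)
```

Selecting the component by name rather than by position keeps the code readable and robust to a change in column order.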