R: convert hidden table to matrix
I have a table that looks like this:
Row Col Value
1 1 31
1 2 56
1 8 13
2 1 83
2 2 51
2 9 16
3 2 53
I need to convert this table to matrix (column Row
represents rows and column represents Col
columns). For output like this:
1 2 3 4 5 6 7 8 9
1 31 56 NA NA NA NA NA 13 NA
2 81 51 NA NA NA NA NA NA 16
3 NA 53 NA NA NA NA NA NA NA
I believe there is a quick way to do what I want as my solution would be looping for every row / column combination and cbind
that's it.
Reproducible example:
require(data.table)
myTable <- data.table(
Row = c(1,1,1,2,2,2,3),
Col = c(1,2,8,1,2,9,1),
Value = c(31,56,13,83,51,16,53))
source to share
xtabs
in base R is perfect for this if you can live from "0" wherever you have NA
.
This would be the basic approach:
xtabs(Value ~ Row + Col, myTable)
# Col
# Row 1 2 8 9
# 1 31 56 13 0
# 2 83 51 0 16
# 3 53 0 0 0
However, this does not fill in the gaps because not all factor levels are available. You can do this separately or on the fly, for example:
xtabs(Value ~ factor(Row, sequence(max(Row))) +
factor(Col, sequence(max(Col))), myTable)
# factor(Col, sequence(max(Col)))
# factor(Row, sequence(max(Row))) 1 2 3 4 5 6 7 8 9
# 1 31 56 0 0 0 0 0 13 0
# 2 83 51 0 0 0 0 0 0 16
# 3 53 0 0 0 0 0 0 0 0
In addition, this means that if the "Row" and "Col" values are factors, this dcast.data.table
should also work:
dcast.data.table(myTable, Row ~ Col, value.var = "Value", drop = FALSE)
(But this is not in my test for some reason. I had to do library(reshape2); dcast(myTable, Row ~ Col, value.var = "Value", drop = FALSE)
to get it to work, thus not using the speed of the "data.table".)
source to share
Rare matrix. You probably want a small matrix
require(Matrix) # doesn't require installation
mySmat <- with(myTable,sparseMatrix(Row,Col,x=Value))
which gives
3 x 9 sparse Matrix of class "dgCMatrix"
[1,] 31 56 . . . . . 13 .
[2,] 83 51 . . . . . . 16
[3,] 53 . . . . . . . .
Matrix. If you really want a matrix
-class object with NA
s, there
myMat <- as.matrix(mySmat)
myMat[myMat==0] <- NA
which gives
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 31 56 NA NA NA NA NA 13 NA
[2,] 83 51 NA NA NA NA NA NA 16
[3,] 53 NA NA NA NA NA NA NA NA
efficiency considerations. For shorter code:
myMat <- with(myTable,as.matrix(sparseMatrix(Row,Col,x=Value)))
myMat[myMat==0] <- NA
For more speed (but slower than sparse matrix creation), initialize before NA
and then fill like @jimmyb and @bgoldst:
myMat <- with(myTable,matrix(,max(Row),max(Col)))
myMat[cbind(myTable$Row,myTable$Col)] <- myTable$Value
This workaround is only necessary if you insist NA
on zeros. A sparse matrix is almost what you should be using. Creation and work with it should be faster; and its storage should be less memory intensive.
source to share
I find the most concise and efficient way to achieve this is to preallocate the matrix using NA and then assign the vector slice by manually calculating the linear indices from Row
and Col
:
df <- data.frame(Row=c(1,1,1,2,2,2,3), Col=c(1,2,8,1,2,9,2), Value=c(31,56,13,83,51,16,53) );
m <- matrix(NA,max(df$Row),max(df$Col));
m[(df$Col-1)*nrow(m)+df$Row] <- df$Value;
m;
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
## [1,] 31 56 NA NA NA NA NA 13 NA
## [2,] 83 51 NA NA NA NA NA NA 16
## [3,] NA 53 NA NA NA NA NA NA NA
source to share