R apply: names of dimnames are lost
This is a pretty simple question, and I'm surprised I can't find anyone who has asked it before. It is not the same as this question and is not covered by this discussion.
I have a 4-dimensional array (dimensions 16x10x15x39) with named dimnames (this is what you get when you cast a data frame, e.g. one read from a CSV file; the names of the dimnames are returned by names(dimnames(matrix))).
Then I want to replace the values in each column (i.e. along the first dimension) with their fractions of the column subtotal, so I do this:
matrix2 <- apply(matrix1, c(2,3,4), function(x){x/sum(x)})
But now names(dimnames(matrix2)) is empty for the first dimension. The names of the other dimensions are preserved.
So: how can I run apply over an array with named dimnames and keep the names of all the dimensions?
Reproducible example
Here's a simple example of the problem. Just run all the code and look at the last two lines.
library(reshape)  # provides cast()
x <- data.frame(
name=c("bob","james","sarah","bob","james",
"sarah","bob","james","sarah","bob",
"james","sarah"),
year=c("1995","1995","1995","1995","1995",
"1995","2005","2005","2005","2005",
"2005","2005"),
sample_num=c("sample1","sample1","sample1",
"sample2","sample2","sample2",
"sample1","sample1","sample1",
"sample2","sample2","sample2"),
value=c(1,2,3,2,3,4,1,2,3,2,3,4)
)
x <- cast(x, sample_num ~ name ~ year)
x_fractions <- apply(x, c(2,3), function(x){x / sum(x)})
names(dimnames(x))
names(dimnames(x_fractions))
I'm not really sure what you are looking for, but I think the function sweep fits your purpose. Try:
result <- sweep(test, c(2,3,4), colSums(test), FUN='/')
where test is the array created by @user2068776. The dimnames, including their names, persist:
dimnames(result)
$a
[1] "a1" "a2"
$b
[1] "b1" "b2"
$c
[1] "c1" "c2"
$d
[1] "d1" "d2"
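As a possible alternative to spelling out sweep by hand, base R's prop.table should give the same fractions for a given margin. The snippet below is only a sketch: it rebuilds the small test array from the other answer so that it runs on its own, and the names via_sweep and via_prop are made up for illustration.

```r
# Rebuild the 2x2x2x2 example array with named dimnames
test <- array(c(1, 2, 11, 22, 3, 4, 33, 44, 5, 6, 55, 66, 7, 8, 77, 88),
              dim = c(2, 2, 2, 2),
              dimnames = list(a = c("a1", "a2"), b = c("b1", "b2"),
                              c = c("c1", "c2"), d = c("d1", "d2")))

# Divide every entry by the sum over the first dimension, in two ways
via_sweep <- sweep(test, c(2, 3, 4), colSums(test), FUN = "/")
via_prop  <- prop.table(test, margin = c(2, 3, 4))

# Compare the two results and inspect the names of the dimnames
all.equal(via_sweep, via_prop)
names(dimnames(via_prop))
```

Both versions start from the original array and only divide its entries, so the array's dimnames are carried along unchanged.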
It is really hard to answer a question that has no reproducible example. I am answering anyway because building an example for it is an interesting exercise.
dat <- array(rnorm(16*10*15*39))
dim(dat) <- c(16,10,15,39)
dimnames(dat) <- lapply(c(16,10,15,39),
function(x) paste('a',sample(1:1000,x,replace=FALSE),sep=''))
dat2 <- apply(dat, c(2,3,4), function(x){x/sum(x)})
identical(dimnames(dat2) ,dimnames(dat))
[1] TRUE
I get the same dimnames for dat and dat2, so I must have missed something.
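One thing that may explain the difference: the dimnames list built above has no names of its own, so names(dimnames(dat)) is NULL before and after apply and the comparison cannot show the problem. Here is a smaller sketch with a named dimnames list. The array size and the labels row, col and layer are made up for illustration, and whether apply keeps the first name may depend on the R version, so no output is shown for that line.

```r
set.seed(42)

# Small 3-d array whose dimnames list itself carries names
dat <- array(rnorm(2 * 3 * 4), dim = c(2, 3, 4),
             dimnames = list(row   = c("r1", "r2"),
                             col   = c("c1", "c2", "c3"),
                             layer = paste0("l", 1:4)))

dat2 <- apply(dat, c(2, 3), function(x) x / sum(x))

names(dimnames(dat))
names(dimnames(dat2))  # the first entry may come back empty

# The margin covers all trailing dimensions and the function returns a vector
# as long as the first one, so the result has the same shape and order as the
# input and the dimnames can simply be copied across.
if (identical(dim(dat2), dim(dat))) dimnames(dat2) <- dimnames(dat)
names(dimnames(dat2))
```

Copying the dimnames like this is only safe when the result really has the same dimension order as the input, which is why the sketch checks dim first.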
I cannot reproduce this behavior using an array that was not created with cast. Could you provide a reproducible example of what your file / dataset looks like? Otherwise, it is difficult to determine what the problem is.
Here's the code I used for testing:
# example
a <- c(1,2,11,22)
b <- c(3,4,33,44)
c <- c(5,6,55,66)
d <- c(7,8,77,88)
test <- array(c(a,b,c,d), c(2,2,2,2), dimnames=list(
a=c("a1","a2"),
b=c("b1","b2"),
c=c("c1","c2"),
d=c("d1","d2") )
)
dimnames(test)
names(dimnames(test))
# apply
test2 <- apply(test, c(2,3,4), function(x){
entry <- sum(x)
})
dimnames(test2)
names(dimnames(test2))
Sorry for posting a comment "disguised" as an answer. I'm new to SO and it seems you need a higher reputation to post comments.
Edit: your dimnames might get lost because, for whatever reason, the function you defined returns unnamed results. You can try saving x/sum(x) as an object (as I did) and then returning that object from your function. I skipped that last part because no names or dimnames were missing in my test.
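To illustrate the guess in the edit above, here is a small sketch that compares a function returning a named vector with one whose result has been stripped with unname. It rebuilds the test array so that it runs on its own; the names named and unnamed are only for illustration.

```r
# Rebuild the 2x2x2x2 example array with named dimnames
test <- array(c(1, 2, 11, 22, 3, 4, 33, 44, 5, 6, 55, 66, 7, 8, 77, 88),
              dim = c(2, 2, 2, 2),
              dimnames = list(a = c("a1", "a2"), b = c("b1", "b2"),
                              c = c("c1", "c2"), d = c("d1", "d2")))

# x / sum(x) keeps the element names of x; unname() drops them
named   <- apply(test, c(2, 3, 4), function(x) x / sum(x))
unnamed <- apply(test, c(2, 3, 4), function(x) unname(x / sum(x)))

dimnames(named)[[1]]      # labels taken from the names of each result vector
dimnames(unnamed)[[1]]    # NULL: nothing to label the first dimension with
names(dimnames(unnamed))  # the margins b, c, d keep their names
```

So if the function passed to apply returns unnamed vectors, the labels of the first dimension are lost, while the dimensions listed in the margin keep theirs.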