How to add named vectors of unequal length in R

It should be easy, but I can't seem to get it to work. I have two named vectors of unequal length:

x <- as.vector(c(5, 10,15,20))
names(x) <- c("A", "B", "C", "D")
y <- as.vector(c(7, 12))
names(y) <- c("A", "D")

      

I want to add these and keep the naming convention of the longest. I would like x + y to give:

A   B   C   D
12  10  15  32

      

I've tried making the lengths equal, as suggested elsewhere, and that allows arithmetic, but doesn't preserve the naming convention. I've also tried things like:

z <- x[names(y)] + y

      

but that performs the arithmetic without preserving the structure (only A and D are returned).

+3




3 answers


You can use replace to do this by matching the names of the two vectors:

x + replace(rep(0, length(x)), names(x) %in% names(y), y)
#  A  B  C  D 
# 12 10 15 32 

      

This assumes that the names of x and y appear in the same order and that every name in y also occurs in x.
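To illustrate why the ordering assumption matters, here is a small sketch. The vector y2 is a hypothetical reordered copy of y (it is not part of the question); replace() fills the matching positions of x in sequence and ignores the names attached to the values it is given.

```r
x <- c(A = 5, B = 10, C = 15, D = 20)
y <- c(A = 7, D = 12)

# y2 is a hypothetical copy of y with its elements in the opposite order
y2 <- y[c("D", "A")]

# replace() writes the values into the selected positions in order,
# without looking at the names carried by y or y2
res_same    <- x + replace(rep(0, length(x)), names(x) %in% names(y), y)
res_swapped <- x + replace(rep(0, length(x)), names(x) %in% names(y2), y2)

res_same     # A = 12 and D = 32, as intended
res_swapped  # A = 17 and D = 27: the values were added to the wrong names
```

If the ordering of the two vectors cannot be guaranteed, a name-based lookup is the safer choice.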

Alternatively, you can do something like the following, which doesn't require the same ordering:



z <- x
m <- match(names(y), names(x))
z[m] <- z[m] + y
z
#  A  B  C  D 
# 12 10 15 32 
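One case the match-based version does not cover as written is a name in y that is missing from x: match() then returns NA, and an NA subscript is not allowed in this kind of assignment. A possible guard, sketched below under the assumption that such entries should simply be ignored (y3 and keep are hypothetical names used only for illustration), is to drop the unmatched positions first.

```r
x <- c(A = 5, B = 10, C = 15, D = 20)

# y3 is a hypothetical vector: out of order, and "E" does not occur in x
y3 <- c(D = 12, E = 1, A = 7)

m <- match(names(y3), names(x))  # position in x of each name in y3, NA for "E"
keep <- !is.na(m)                # TRUE for entries of y3 that have a partner in x

z <- x
z[m[keep]] <- z[m[keep]] + y3[keep]
z
```

Whether unmatched names should be dropped, appended, or treated as an error depends on the data; this sketch only shows the first option.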

      

While neither of these is as concise as @RichardScriven's suggestion to use tapply, they are much more efficient on large vectors:

set.seed(144)
big.x <- runif(1000000)
names(big.x) <- paste("x", 1:1000000)
big.y <- big.x[sort(sample(1:1000000, 500000))]
sum.rscriven <- function(x, y) {
  z <- c(x, y)
  tapply(z, names(z), sum)
}
sum.josilber1 <- function(x, y) x + replace(rep(0, length(x)), names(x) %in% names(y), y)
sum.josilber2 <- function(x, y) {
  z <- x
  m <- match(names(y), names(x))
  z[m] <- z[m] + y
  z
}
system.time(sum.rscriven(big.x, big.y))
#    user  system elapsed 
#  12.650   0.151  12.817 
system.time(sum.josilber1(big.x, big.y))
#    user  system elapsed 
#   0.214   0.002   0.215 
system.time(sum.josilber2(big.x, big.y))
#    user  system elapsed 
#   0.180   0.003   0.182 

      

Note that both proposed solutions are at least 50 times faster than the tapply solution in this example (big.x is 1 million long, big.y is 500k long), because they perform a single vectorized addition instead of many smaller calls to sum.
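Before comparing timings, it can be worth confirming that the three functions return the same values. The following is a minimal sketch of such a check on the small vectors from the question; the variable names r0, r1, r2, same1 and same2 are introduced here for illustration only.

```r
x <- c(A = 5, B = 10, C = 15, D = 20)
y <- c(A = 7, D = 12)

sum.rscriven <- function(x, y) {
  z <- c(x, y)
  tapply(z, names(z), sum)
}
sum.josilber1 <- function(x, y) x + replace(rep(0, length(x)), names(x) %in% names(y), y)
sum.josilber2 <- function(x, y) {
  z <- x
  m <- match(names(y), names(x))
  z[m] <- z[m] + y
  z
}

r0 <- sum.rscriven(x, y)
r1 <- sum.josilber1(x, y)
r2 <- sum.josilber2(x, y)

# tapply() returns a 1-d array, so compare bare values looked up by name
same1 <- isTRUE(all.equal(as.vector(r0[names(x)]), as.vector(r1)))
# the other two results are both named numeric vectors
same2 <- isTRUE(all.equal(r1, r2))
c(same1, same2)
```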

+2




You can use tapply():



z <- c(x, y)
tapply(z, names(z), sum)
#  A  B  C  D 
# 12 10 15 32 
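Note that tapply() groups by a factor built from the names, so the result comes back sorted by name as a 1-d array. In the question the names are already alphabetical, so this is not visible. A small sketch of restoring the original order, assuming a hypothetical input x_rev whose names are not sorted:

```r
# x_rev is a hypothetical input whose names are in reverse alphabetical order
x_rev <- c(D = 20, C = 15, B = 10, A = 5)
y <- c(A = 7, D = 12)

z <- c(x_rev, y)
by_name <- tapply(z, names(z), sum)  # groups are returned in sorted order

# look the sums up by the original names, then rebuild a plain named vector
in_x_order <- setNames(as.vector(by_name[names(x_rev)]), names(x_rev))
in_x_order
```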

      

+5




One option using `length<-` and colSums:

colSums(rbind(x, `length<-`(y, length(x))[names(x)]), na.rm = TRUE)
#  A  B  C  D 
# 12 10 15 32 

      

Another using merge and colSums (note that the result is not in the order of x):

colSums(merge(as.list(x), as.list(y), all = TRUE), na.rm = TRUE)
#  A  D  B  C 
# 12 32 10 15 

      

+1








