# R: mean by factor

I have data like this

```
data
name v1  v2  v3  v4  v5
a    1   2   7   9   3
b    3   8   6   4   8
c    2   5   0   1   9
a    6   0   6   2   1
c    3   9   4   7   5
```

`name` is a factor variable. I want to calculate the mean of `v2,v3,v4,v5` by the factor `data$name`. I used the following command, but it didn't work.

```
tapply(data[,3:6],data$name,mean)
```

I have now used the following code

```
newdata<-0
for (name in unique(data$name)){
  rowIndex <- which(data$name == name)
  result <- colMeans(data[rowIndex,])
  newdata[name,]<-result
}
```

The required result is obtained. But I want to know if there is any neat way to do this.


Here's another way

```
library(data.table)
cols <- paste0("v", 2:5) # set the columns you want to operate on
setDT(data)[, Sums := rowSums(.SD), .SDcols = cols]
data[, list(Means = sum(Sums)/(.N*length(cols))), by = name]
##    name Means
## 1:    a  3.75
## 2:    b  6.50
## 3:    c  5.00
```

Edit

Per @Arun's suggestion, this would probably be much better:

```
setDT(data)[, mean(c(v2,v3,v4,v5)), by=name]
##    name   V1
## 1:    a 3.75
## 2:    b 6.50
## 3:    c 5.00
```

Or, per @Ananda's suggestion:

```
library(reshape2)
melt(setDT(data), id.vars = "name", measure.vars = cols)[, mean(value), by = name]
##    name   V1
## 1:    a 3.75
## 2:    b 6.50
## 3:    c 5.00
```
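
The first snippet's arithmetic (the total of the row sums divided by the number of pooled values) can also be checked in base R. This is only a sketch: the `read.table` call is an assumption about how the question's table was loaded, and `row_sums`, `group_sums`, and `n_values` are illustrative names, not part of the original answer.

```r
# Rebuild the question's table (how it was originally loaded is an assumption)
data <- read.table(text = "
name v1 v2 v3 v4 v5
a 1 2 7 9 3
b 3 8 6 4 8
c 2 5 0 1 9
a 6 0 6 2 1
c 3 9 4 7 5", header = TRUE, stringsAsFactors = TRUE)

cols <- paste0("v", 2:5)

# Per-row total over v2..v5
row_sums <- rowSums(data[cols])

# Total per group, and how many individual values each group pools
group_sums <- tapply(row_sums, data$name, sum)
n_values   <- as.vector(table(data$name)) * length(cols)

means <- group_sums / n_values
means
#    a    b    c
# 3.75 6.50 5.00
```

Each group's divisor is its row count times the number of averaged columns, which mirrors `.N*length(cols)` in the data.table version.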

As per the expected result, i.e. `The expected result for factor a is ((2+7+9+3)+(0+6+2+1))/8`:

```
sapply(split(dat[,-(1:2)], dat$name), function(x) sum(x)/prod(dim(x)))
#  a    b    c
# 3.75 6.50 5.00
```

or

```
tapply(rowMeans(dat[,-(1:2)]), dat[,1], sum)/table(dat[,1])
#a    b    c
#3.75 6.50 5.00
```

or

```
m1 <- as.matrix(dat[,-c(1:2)])
c(by(c(m1), dat[,1][row(m1)], FUN=mean))
#  a    b    c
#3.75 6.50 5.00
```

Or the methods suggested by @Ananda Mahto:

```
tapply(unlist(dat[-c(1, 2)]), rep(dat$name, 4), mean)
#   a    b    c
#3.75 6.50 5.00

tapply(stack(dat, select = paste0("v", 2:5))$values, rep(dat$name, 4), mean)
#  a    b    c
#3.75 6.50 5.00
```
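
The `rep(dat$name, 4)` in the last two calls relies on the values being laid out column by column, so the factor has to be repeated once per column. A small sketch of that alignment, assuming `dat` is the question's table read with `read.table` (my assumption) and using `length(cols)` instead of the hard-coded `4`; `values` and `groups` are illustrative names:

```r
# Rebuild the question's table as `dat` (loading method is an assumption)
dat <- read.table(text = "
name v1 v2 v3 v4 v5
a 1 2 7 9 3
b 3 8 6 4 8
c 2 5 0 1 9
a 6 0 6 2 1
c 3 9 4 7 5", header = TRUE, stringsAsFactors = TRUE)

cols <- paste0("v", 2:5)

# unlist() concatenates the columns one after another: all of v2, then v3, ...
values <- unlist(dat[cols], use.names = FALSE)

# Repeat the whole factor once per column so groups[i] labels values[i]
groups <- rep(dat$name, times = length(cols))

res <- tapply(values, groups, mean)
res
#    a    b    c
# 3.75 6.50 5.00
```

Deriving the repeat count from `cols` keeps the grouping vector the same length as the values if the set of columns changes.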

This can be done with a combination of the dplyr and tidyr packages:

```
library(dplyr)
library(tidyr)

data %>% gather(name, value, v2:v5) %>%
  group_by(name) %>% summarize(average=mean(value))
#   name average
# 1    a    3.75
# 2    b    6.50
# 3    c    5.00
```

This works because `gather` combines the columns `v2:v5` into a single column, where they can be intuitively grouped:

```
data %>% gather(name, value, v2:v5)
#    name v1 name value
# 1     a  1   v2     2
# 2     b  3   v2     8
# 3     c  2   v2     5
# 4     a  6   v2     0
# 5     c  3   v2     9
# 6     a  1   v3     7
# ...
```
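
The same reshape-then-group idea can be sketched without extra packages. This is not the answer's method, just a base-R analogue under the assumption that `data` is a plain data frame built as below; `long` and `avg` are illustrative names. Here the stacked columns are called `values` and `ind`, so they do not share a name with the grouping column.

```r
# Rebuild the question's table (loading method is an assumption)
data <- read.table(text = "
name v1 v2 v3 v4 v5
a 1 2 7 9 3
b 3 8 6 4 8
c 2 5 0 1 9
a 6 0 6 2 1
c 3 9 4 7 5", header = TRUE, stringsAsFactors = TRUE)

cols <- paste0("v", 2:5)

# Long format: one row per (name, column) pair.
# stack() returns `values` (the numbers) and `ind` (the source column).
long <- data.frame(
  name = rep(data$name, times = length(cols)),
  stack(data[cols])
)

# Average the single `values` column within each level of `name`
avg <- aggregate(values ~ name, data = long, FUN = mean)
avg
#   name values
# 1    a   3.75
# 2    b   6.50
# 3    c   5.00
```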

Edit: The original answer didn't give the correct result. This seems to work fine (`select(-variable)` avoids having an extra column, but is not otherwise required).

Using the dplyr and reshape2 packages:

```
library(reshape2)
library(dplyr)
data %>%
  select(-v1) %>%
  melt %>%
  group_by(name) %>%
  select(-variable) %>%
  summarise_each(funs(mean))
# Source: local data frame [3 x 2]
#
#   name value
# 1    a  3.75
# 2    b  6.50
# 3    c  5.00
```