R: Calculate the average of a variable over the unique values of another variable in a data frame?

I'm new to R. I have a dataframe that looks like this:

Pupil ID     State      GPA
1            FL         3.9
2            TX         3.2
3            NY         2.2
4            AK         3.0
5            CO         2.4


... etc. I would like to create a new dataframe that looks like this:

State        Mean GPA     Number of pupils 
AL           2.91         23
AK           3.23         24


etc. In other words, I would like to find the unique values of State and, for each one, calculate the average GPA and the number of pupils.

Is this possible in R? I know that I can use table(data$State) to get the unique states and the pupil count for each, but I don't know how to calculate the average for each unique state value.



2 answers

Here is one of many ways to do this:

x <- read.table(header=TRUE, text="Pupil.ID     State      GPA
1            FL         3.9
2            TX         3.2
3            NY         2.2
4            AK         3.0
5            CO         2.4")

# FUN returns a named vector, so mean and count end up together in one matrix column called GPA
aggregate(GPA~State, data=x, FUN=function(v) c(mean=mean(v), count=length(v)))
##   State GPA.mean GPA.count
## 1    AK      3.0       1.0
## 2    CO      2.4       1.0
## 3    FL      3.9       1.0
## 4    NY      2.2       1.0
## 5    TX      3.2       1.0
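One thing to be aware of with this approach: the printed result looks like three columns, but aggregate() stores the mean and count together in a single matrix column. The sketch below uses made-up sample data (not the question's) with repeated states, and shows one common way to flatten the result into ordinary columns with do.call(data.frame, ...). The names pupils, res and flat are only illustrative.

```r
# Made-up sample data (not from the question), with repeated states
pupils <- data.frame(
  Pupil.ID = 1:6,
  State    = c("FL", "FL", "TX", "TX", "TX", "NY"),
  GPA      = c(3.9, 3.1, 3.2, 2.8, 3.0, 2.2)
)

# Mean and count per state; FUN returns a named vector of length 2
res <- aggregate(GPA ~ State, data = pupils,
                 FUN = function(v) c(mean = mean(v), count = length(v)))

# Only two columns here: State, plus a matrix column GPA
ncol(res)

# Flatten the matrix column into separate GPA.mean and GPA.count columns
flat <- do.call(data.frame, res)
names(flat)
print(flat)
```

After flattening, flat$GPA.mean and flat$GPA.count can be used like any other data frame columns.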




The best way to do this is to use group_by() in conjunction with summarise() from the dplyr package. If df is your data frame,

library(dplyr)

df %>%
   group_by(State) %>%
   summarise(mean_GPA = mean(GPA),
             number_of_pupils = n())


will give you the mean GPA for each unique state, as well as the number of pupils (the row count) in each state.
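If dplyr is not available, roughly the same summary can be put together in base R from tapply() and the table() call mentioned in the question. This is only a sketch under assumptions: the data below is made up for illustration, and the names pupils, mean_gpa, n_pupils and summary_df are hypothetical.

```r
# Made-up sample data (not from the question), with repeated states
pupils <- data.frame(
  State = c("FL", "FL", "TX", "TX", "TX", "NY"),
  GPA   = c(3.9, 3.1, 3.2, 2.8, 3.0, 2.2)
)

# Mean GPA per state (a named 1-d array, one entry per unique state)
mean_gpa <- tapply(pupils$GPA, pupils$State, mean)

# Number of pupils per state, as in the question's table() call
n_pupils <- table(pupils$State)

# Combine both into one data frame, aligning the counts by state name
summary_df <- data.frame(
  State            = names(mean_gpa),
  mean_GPA         = as.vector(mean_gpa),
  number_of_pupils = as.vector(n_pupils[names(mean_gpa)])
)
print(summary_df)
```

Indexing the counts by names(mean_gpa) keeps the two results aligned by state rather than relying on both being in the same order.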


