How do I create a new list of object attributes from a list of those objects?
I am working with git2r and want to create some basic statistics about project activity. git2r returns all commits as a list of S4 objects. Below I show the structure of the first object:
> library(git2r)
> repo <- repository('/Users/swain/Dropbox/projects/from-github/brakeman')
> last3 <- commits(repo, n=3)
> str(last3)
List of 3
$ :Formal class 'git_commit' [package "git2r"] with 6 slots
.. ..@ sha : chr "f7746c21846d895bd90632df5a2366381ced77d9"
.. ..@ author :Formal class 'git_signature' [package "git2r"] with 3 slots
.. .. .. ..@ name : chr "Justin"
.. .. .. ..@ email: chr "presidentbeef@users.noreply.github.com"
.. .. .. ..@ when :Formal class 'git_time' [package "git2r"] with 2 slots
.. .. .. .. .. ..@ time : num 1.5e+09
.. .. .. .. .. ..@ offset: num -420
.. ..@ committer:Formal class 'git_signature' [package "git2r"] with 3 slots
.. .. .. ..@ name : chr "GitHub"
.. .. .. ..@ email: chr "noreply@github.com"
.. .. .. ..@ when :Formal class 'git_time' [package "git2r"] with 2 slots
.. .. .. .. .. ..@ time : num 1.5e+09
.. .. .. .. .. ..@ offset: num -420
.. ..@ summary : chr "Merge pull request #1056 from presidentbeef/hash_access_interpolation_performance_improvements"
.. ..@ message : chr "Merge pull request #1056 from presidentbeef/hash_access_interpolation_performance_improvements\n\nHash access i"| __truncated__
.. ..@ repo :Formal class 'git_repository' [package "git2r"] with 1 slot
.. .. .. ..@ path: chr "/Users/swain/Dropbox/projects/from-github/brakeman"
I have searched high and low for a way to extract one slot from all the objects in the list. For example, for all the S4 objects in the list last3, I want to pull author into a new list. Note that there are nested objects here, so I may also want to make a list of something on an object that sits in a slot of the top-level object.
Ultimately I want to start creating plots and summaries of various fields. For example, a histogram of commits by day of the week; box plots of message length by committer; things like that. Is converting slots to lists or vectors the wrong way to go about this?
Here's a tidyverse solution for what you are trying to achieve. Jenny Bryan has a good set of introductory docs on how to use purrr (and other packages) for this kind of task: https://jennybc.github.io/purrr-tutorial/ .
library(git2r)
library(dplyr)
library(ggplot2)
library(purrr)
library(lubridate)
options(stringsAsFactors = FALSE)
repo <- repository("/git-repos/brakeman/")
# Get relevant bits out of the list
analysis_df <-
  repo %>%
  commits(n = 50) %>%
  map_df(
    ~ data.frame(
      name = .@author@name,
      date = .@author@when@time %>% as.POSIXct(origin = "1970-01-01"),
      message = .@message
    )
  )
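The question also asks how to pull a single slot, such as author, out of every object into a plain list. A minimal base R sketch of that follows. So that it runs without a repository, it uses two hypothetical toy classes (toy_signature and toy_commit) that only mimic the nesting shown in the str() output; they are not git2r classes. With real S4 commit objects the same lapply/vapply calls should carry over, assuming the slots are named as shown in the question.

```r
library(methods)

# Hypothetical stand-ins for the nested S4 structure (not git2r classes)
setClass("toy_signature", representation(name = "character", email = "character"))
setClass("toy_commit", representation(sha = "character", author = "toy_signature"))

toy_commits <- list(
  new("toy_commit", sha = "a1",
      author = new("toy_signature", name = "Ann", email = "ann@example.com")),
  new("toy_commit", sha = "b2",
      author = new("toy_signature", name = "Bob", email = "bob@example.com"))
)

# One slot from every object: a list of the nested author objects
authors <- lapply(toy_commits, slot, "author")

# A slot of a slot: a character vector of author names
author_names <- vapply(toy_commits, function(x) x@author@name, character(1))
```

lapply() keeps the result as a list, which suits slots that hold objects; vapply() with character(1) gives a plain vector and fails loudly if any element is not a single string.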
# A histogram of commits by day of the week
analysis_df %>%
  mutate(weekday = weekdays(date)) %>%
  group_by(weekday) %>%
  tally() %>%
  ggplot(aes(x = weekday, y = n)) +
  geom_bar(stat = "identity")
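One thing to watch in the plot above: weekdays() returns plain character strings, so ggplot2 orders the bars alphabetically rather than Monday to Sunday. Below is a small sketch of one way to fix the order with a factor, using three made-up timestamps instead of real commit data. The weekday_levels vector and its short English labels are assumptions for illustration.

```r
# Three made-up commit times (seconds since the epoch), not real data
dates <- as.POSIXct(c(1500000000, 1500086400, 1500604800),
                    origin = "1970-01-01", tz = "UTC")

# POSIXlt$wday runs from 0 (Sunday) to 6 (Saturday), independent of locale
wday <- as.POSIXlt(dates)$wday

# Map the numbers to labels whose level order runs Monday to Sunday
weekday_levels <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
weekday <- factor(wday, levels = c(1:6, 0), labels = weekday_levels)

# Counts per weekday, including days with no commits
table(weekday)
```

Using a factor like this as the x aesthetic should make ggplot2 follow the level order instead of sorting alphabetically.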
# Bar chart of the mean message length by author
# (a summary standing in for box plots by committer)
analysis_df %>%
  mutate(message_length = nchar(message)) %>%
  group_by(name) %>%
  summarise(mean_message_length = mean(message_length)) %>%
  ggplot(aes(x = name, y = mean_message_length)) +
  geom_bar(stat = "identity")
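The chart above shows only the mean message length per author. If the full distribution is wanted, as the question's mention of box plots suggests, geom_boxplot() can be applied to the unsummarised lengths. Here is a sketch on a small invented data frame. toy_df and its contents are made up; with real data, analysis_df from above would take its place.

```r
library(ggplot2)

# Invented stand-in for analysis_df: one row per commit
toy_df <- data.frame(
  name    = c("Ann", "Ann", "Ann", "Bob", "Bob", "Bob"),
  message = c("Fix typo", "Add tests for parser", "Merge branch",
              "Initial commit", "Refactor", "Update README with examples"),
  stringsAsFactors = FALSE
)

# Keep one length per commit instead of collapsing to a mean
toy_df$message_length <- nchar(toy_df$message)

# One box per author, showing median, quartiles and outliers
p <- ggplot(toy_df, aes(x = name, y = message_length)) +
  geom_boxplot()
print(p)
```

Skipping the group_by()/summarise() step is the key difference: the box plot needs every individual length, not one number per author.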