Recursively manipulate employee and manager data to create an organization tree hierarchy in R

I often analyze data in an "organizational tree" format to understand the frequency of actions under a particular leader in an organization. I need to build a wide hierarchy from two columns of data: employee name and manager name.

df <- data.frame("Employee"=c("Bill","James","Amy","Jen","Henry"),
                      "Supervisor"=c("Jen","Jen","Steve","Amy","Amy"))
df
#   Employee Supervisor
# 1     Bill        Jen
# 2    James        Jen
# 3      Amy      Steve
# 4      Jen        Amy
# 5    Henry        Amy

      

The end result should be a wide data frame that defines the organization chart, starting with the CEO (or highest-ranking employee):

#  Employee       H1     H2    H3
# 1    Bill    Steve    Amy   Jen
# 2   James    Steve    Amy   Jen
# 3     Amy    Steve     NA    NA
# 4     Jen    Steve    Amy    NA
# 5   Henry    Steve    Amy    NA

      

After much research, the data.tree package seems to offer the most help. How can I accomplish this?



2 answers


Try the following:

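library(data.table)
setDT(df)  # convert df to a data.table by reference

# Number the first supervisor column so further levels follow the same pattern
setnames(df, 'Supervisor', 'Supervisor.1')

j = 1
# Keep going while anyone in the newest supervisor column has a supervisor of their own
while (df[, any(get(paste0('Supervisor.',j)) %in% Employee)]) {
  # Self-join: look up each level-j supervisor as an Employee and store their supervisor
  df[df, on=paste0('Supervisor.',j,'==Employee'),
     paste0('Supervisor.',j+1):= i.Supervisor.1]
  j = j + 1
}

df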
#    Employee Supervisor.1 Supervisor.2 Supervisor.3
# 1:     Bill          Jen          Amy        Steve
# 2:    James          Jen          Amy        Steve
# 3:      Amy        Steve           NA           NA
# 4:      Jen          Amy        Steve           NA
# 5:    Henry          Amy        Steve           NA
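
For anyone without data.table, the same "look up each supervisor's supervisor" idea can be sketched in base R with match() in place of the join. This is only a sketch: the name chain is made up for illustration, and it assumes the data contains no cycles (otherwise the loop never ends).

```r
df <- data.frame(Employee   = c("Bill", "James", "Amy", "Jen", "Henry"),
                 Supervisor = c("Jen", "Jen", "Steve", "Amy", "Amy"),
                 stringsAsFactors = FALSE)

# 'chain' (hypothetical name) starts with the employee and the direct supervisor
chain <- data.frame(Employee = df$Employee, Supervisor.1 = df$Supervisor,
                    stringsAsFactors = FALSE)

j <- 1
# Keep climbing while someone in the newest column is also listed as an employee
while (any(chain[[j + 1]] %in% df$Employee)) {
  # match() gives NA for names with no row of their own (Steve), so their supervisor is NA
  rows <- match(chain[[j + 1]], df$Employee)
  chain[[paste0("Supervisor.", j + 1)]] <- df$Supervisor[rows]
  j <- j + 1
}

chain
```

On the sample data this should stop after adding Supervisor.3, with the same columns as the data.table result above.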

      



To reverse the order within each row, so that the top of the hierarchy comes first:

df = cbind(df[, 1], t(apply(df[, -1], 1, function(r) c(rev(r[!is.na(r)]), r[is.na(r)]))))
df
#    Employee    V1  V2  V3
# 1:     Bill Steve Amy Jen
# 2:    James Steve Amy Jen
# 3:      Amy Steve  NA  NA
# 4:      Jen Steve Amy  NA
# 5:    Henry Steve Amy  NA
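
The question asked for columns named H1, H2, H3 rather than V1, V2, V3. A minimal base R sketch of the reversal step with that naming, assuming the intermediate table is typed in by hand as below (chain, top_down and wide are made-up names for illustration):

```r
# Hypothetical input: the intermediate result before reversing, entered by hand
chain <- data.frame(
  Employee     = c("Bill", "James", "Amy", "Jen", "Henry"),
  Supervisor.1 = c("Jen", "Jen", "Steve", "Amy", "Amy"),
  Supervisor.2 = c("Amy", "Amy", NA, "Steve", "Steve"),
  Supervisor.3 = c("Steve", "Steve", NA, NA, NA),
  stringsAsFactors = FALSE
)

# Reverse the non-missing entries of one row and push the NAs to the end
top_down <- function(r) c(rev(r[!is.na(r)]), r[is.na(r)])

# apply() returns one column per input row, so transpose back to one row per employee
h <- t(apply(as.matrix(chain[, -1]), 1, top_down))
colnames(h) <- paste0("H", seq_len(ncol(h)))

wide <- data.frame(Employee = chain$Employee, h, stringsAsFactors = FALSE)
wide
```

The number of H columns follows the deepest chain in the data, so nothing here is hard-coded to three levels.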

      



If you don't insist on that exact output format but want to work with the hierarchy itself, then data.tree is a great choice. Here are some examples:

library(data.tree)
df <- data.frame("Employee"=c("Bill","James","Amy","Jen","Henry"),
                 "Supervisor"=c("Jen","Jen","Steve","Amy","Amy"))

dt <- FromDataFrameNetwork(df)

# here is your org chart:

print(dt)

      

Let's find Jen's subordinates along with their level in the hierarchy:

Get(FindNode(dt, 'Jen')$leaves, 'level')

      

This returns:

 Bill James 
    4     4 
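
Levels are counted from the root, so Steve is level 1 and Jen's reports land on level 4. As a cross-check without data.tree, here is a small base R sketch that walks up the supervisor chain. The helper level_of is a made-up name, and the sketch assumes the data has no cycles.

```r
df <- data.frame(Employee   = c("Bill", "James", "Amy", "Jen", "Henry"),
                 Supervisor = c("Jen", "Jen", "Steve", "Amy", "Amy"),
                 stringsAsFactors = FALSE)

# Hypothetical helper: the root has level 1, everyone else is one below their supervisor
level_of <- function(name) {
  boss <- df$Supervisor[match(name, df$Employee)]
  # Someone with no row of their own (Steve) has no known supervisor, so they are the root
  if (is.na(boss)) 1 else 1 + level_of(boss)
}

sapply(c("Bill", "James"), level_of)
```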

      

Just for fun, add a salary for each person (values are assigned in the order the nodes are printed):



dt$Set(salary = c(100000, 80000, 60000, 40000, 35000, 70000))

      

Print each salary alongside the total salary of that person's subordinates:

print(dt, 'salary', sal_subordinates = function(node) Aggregate(node, 'salary', sum))

      

This prints:

          levelName salary sal_subordinates
1 Steve             100000            80000
2  °--Amy            80000           130000
3      ¦--Jen        60000            75000
4      ¦   ¦--Bill   40000            40000
5      ¦   °--James  35000            35000
6      °--Henry      70000            70000
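
Reading the printout, sal_subordinates for a manager is the sum of their direct reports' salaries (Amy: 60000 + 70000), while a leaf just shows its own salary. A base R cross-check under that reading, with the tree typed in by hand as a flat table (org and direct are made-up names):

```r
# Hypothetical flat version of the tree above; Steve has no supervisor
org <- data.frame(
  Employee   = c("Steve", "Amy", "Jen", "Bill", "James", "Henry"),
  Supervisor = c(NA, "Steve", "Amy", "Jen", "Jen", "Amy"),
  salary     = c(100000, 80000, 60000, 40000, 35000, 70000),
  stringsAsFactors = FALSE
)

# One total per manager: salaries summed over their direct reports
direct <- tapply(org$salary, org$Supervisor, sum)

# People who manage nobody have no entry in 'direct'; fall back to their own salary
idx <- match(org$Employee, names(direct))
org$sal_subordinates <- ifelse(is.na(idx), org$salary, direct[idx])

org
```

This only mirrors the numbers shown above; it is not how data.tree computes them internally.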

      

The data.tree vignettes contain many more examples of working with hierarchical data and aggregation.
