How to find the top N descending values in a group in dplyr
I have the following data frame in R:
Service Codes
ABS RT
ABS RT
ABS TY
ABS DR
ABS DR
ABS DR
ABS DR
DEF RT
DEF RT
DEF TY
DEF DR
DEF DR
DEF DR
DEF DR
DEF TY
DEF SE
DEF SE
What I want is to count each code within each service and keep the top three per service, in descending order of count:
Service Codes Count
ABS DR 4
ABS RT 2
ABS TY 1
DEF DR 4
DEF RT 2
DEF TY 2
I am doing the following in R:
df %>%
group_by(Service,Codes) %>%
summarise(Count = n()) %>%
top_n(n=3,wt = Count) %>%
arrange(desc(Count)) %>%
as.data.frame()
But that doesn't give me the intended result.
We can try with count/arrange/slice
df1 %>%
count(Service, Codes) %>%
arrange(desc(n)) %>%
group_by(Service) %>%
slice(seq_len(3))
# A tibble: 6 x 3
# Groups: Service [2]
# Service Codes n
# <chr> <chr> <int>
#1 ABS DR 4
#2 ABS RT 2
#3 ABS TY 1
#4 DEF DR 4
#5 DEF RT 2
#6 DEF SE 2
In the OP's code, arrange also needs "Service". As @Marius said in the comments, top_n returns more rows if there are ties. One option is to do a second grouping by "Service" and slice (as shown above); alternatively, after grouping, we can filter:
df1 %>%
group_by(Service,Codes) %>%
summarise(Count = n()) %>%
top_n(n=3,wt = Count) %>%
arrange(Service, desc(Count)) %>%
group_by(Service) %>%
filter(row_number() <=3)
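To make the tie behaviour described above concrete, here is a minimal sketch that rebuilds the question's data and compares top_n with slice_max. It is an alternative rather than the method used in this answer: slice_max and its with_ties argument assume dplyr 1.0.0 or later, and the names counts, with_ties and no_ties are chosen here purely for illustration.

```r
library(dplyr)

# Rebuild the data frame from the question (17 rows).
df1 <- data.frame(
  Service = rep(c("ABS", "DEF"), times = c(7, 10)),
  Codes = c("RT", "RT", "TY", "DR", "DR", "DR", "DR",
            "RT", "RT", "TY", "DR", "DR", "DR", "DR", "TY", "SE", "SE"),
  stringsAsFactors = FALSE
)

# One row per Service/Codes pair, with the count stored in Count.
counts <- count(df1, Service, Codes, name = "Count")

# top_n keeps every row tied for third place, so DEF yields 4 rows.
with_ties <- counts %>%
  group_by(Service) %>%
  top_n(n = 3, wt = Count) %>%
  ungroup()

# slice_max with with_ties = FALSE returns at most 3 rows per group.
no_ties <- counts %>%
  group_by(Service) %>%
  slice_max(Count, n = 3, with_ties = FALSE) %>%
  ungroup()

table(with_ties$Service)  # rows per service when ties are kept
table(no_ties$Service)    # rows per service when ties are dropped
```

Which of the tied codes survives when ties are dropped depends on row order, so add an explicit arrange on a second column if that choice matters.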
In base R, you can do this in two lines.
# get data.frame of counts by service-code pairs
mydf <- data.frame(table(dat))
# get top 3 by service
do.call(rbind, lapply(split(mydf, mydf$Service), function(x) x[order(-x$Freq)[1:3],]))
This returns
Service Codes Freq
ABS.1 ABS DR 4
ABS.3 ABS RT 2
ABS.7 ABS TY 1
DEF.2 DEF DR 4
DEF.4 DEF RT 2
DEF.6 DEF SE 2
In the first line, use table to get the counts and then convert the result to a data.frame. The second line splits by service, uses order on the negated counts to sort in descending order, and pulls out the first three rows of each piece. The results are combined with do.call.
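The split/order/rbind steps above can also be sketched in a slightly more defensive form. This variant is an assumption on my part, not part of the answer: it drops the zero-count pairs that table creates for combinations that never occur, and uses head instead of [1:3] so that a service with fewer than three codes does not produce NA rows. The names dat, counts and top3 are illustrative.

```r
# Rebuild the data frame from the question (17 rows).
dat <- data.frame(
  Service = rep(c("ABS", "DEF"), times = c(7, 10)),
  Codes = c("RT", "RT", "TY", "DR", "DR", "DR", "DR",
            "RT", "RT", "TY", "DR", "DR", "DR", "DR", "TY", "SE", "SE"),
  stringsAsFactors = FALSE
)

# Counts for every Service/Codes pair; table also emits pairs
# that never occur (Freq == 0), so those are filtered out.
counts <- as.data.frame(table(dat), stringsAsFactors = FALSE)
counts <- counts[counts$Freq > 0, ]

# For each service, sort by descending count and keep up to 3 rows.
top3 <- do.call(rbind, lapply(split(counts, counts$Service), function(x) {
  head(x[order(-x$Freq), ], 3)
}))

top3
```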