R: ranking one column conditionally on another, with ties sharing a rank

I have this retweet data frame:

set.seed(28100)
df <- data.frame(user_id = sample(1:8, 10, replace = TRUE),
                 timestamp = sample(1:1000, 10),
                 retweet = sample(999:1002, 10, replace = TRUE))
df <- df[with(df, order(retweet, -timestamp)),]
df
#    user_id timestamp retweet
# 6        8       513     999
# 9        7       339     999
# 3        3       977    1000
# 2        3       395    1000
# 5        2       333    1000
# 4        5       793    1001
# 1        3       873    1002
# 8        2       638    1002
# 7        4       223    1002
# 10       6        72    1002

      

Each retweet has a unique identifier. Within each retweet thread, I want to rank users by timestamp in reverse order, so that the most recent retweet gets rank 1 and the earliest gets the highest rank. The rank should measure each user's influence: the longer the chain, the higher the score for an early retweeter. If two users sent the same retweet at the same time, they should receive the same rank.

In terms of df, the expected result is:

df$ranking <- c(1,2, 1,2,3, 1, 1,2,3,4)
aggregate(ranking~user_id, data=df, sum)

#   user_id ranking
# 1       2       5
# 2       3       4
# 3       4       3
# 4       5       1
# 5       6       4
# 6       7       2
# 7       8       1

      



1 answer


Using data.table:



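library(data.table)
# Within each retweet thread, number the rows from latest to earliest timestamp
setDT(df)[order(-timestamp), ranking2 := seq_len(.N), by = retweet]
# Sum the ranks per user; keyby also sorts the result by user_id
df[, sum(ranking2), keyby = user_id]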
#    user_id V1
# 1:       2  5
# 2:       3  4
# 3:       4  3
# 4:       5  1
# 5:       6  4
# 6:       7  2
# 7:       8  1
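
Note that seq_len(.N) gives consecutive numbers even when two timestamps in a thread are equal, whereas the question asks for tied users to share a rank. A minimal base R sketch of a tie-aware variant follows. It assumes ties should take the lowest rank of their group (ties.method = "min"), which is only one reading of the requirement. The data frame is typed in from the printed table in the question, because sample() may return different values for the same seed on other R versions.

```r
# Data typed in from the printed table in the question
df <- data.frame(
  user_id   = c(8, 7, 3, 3, 2, 5, 3, 2, 4, 6),
  timestamp = c(513, 339, 977, 395, 333, 793, 873, 638, 223, 72),
  retweet   = c(999, 999, 1000, 1000, 1000, 1001, 1002, 1002, 1002, 1002)
)

# Rank within each retweet thread by descending timestamp.
# ties.method = "min" gives equal timestamps the same rank
# (an assumption about how ties should be scored).
df$ranking <- ave(-df$timestamp, df$retweet,
                  FUN = function(x) rank(x, ties.method = "min"))

# Total score per user, sorted by user_id
scores <- aggregate(ranking ~ user_id, data = df, sum)
scores
```

The sample data contains no tied timestamps, so this gives the same totals as the data.table version above. If tied ranks should not leave gaps (1, 1, 2 rather than 1, 1, 3), data.table's frank() with ties.method = "dense" is one alternative.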

      
