Add frequency based on repeating line

I have a dataframe, dd

     Var1 Freq
76   2189- 1181
458  6186-  813
445  6170-  738
902  61801  650
74   2181-  618
504  6268-  509
905  61804  307
500  6259-  272
910  61889  265
495  6253-  242
73   2180-  224
510  6256- 6407
461  6180- 3254
792        2333
1          2312
467  6186- 2259
212  4019- 1254
4561 6170- 1162
462  6181- 1156
80   2189- 1154
465  6184- 1035

      

I would like to search for duplicates in a column Var1

and concatenate them, creating the sum of int it columns Freq

in such a way that ...

     Var1 Freq
76   2189- 2335
458  6186- 3072
445  6170- 1900
902  61801  650
74   2181-  618
504  6268-  509
905  61804  307
500  6259-  272
910  61889  265
495  6253-  242
73   2180-  224
510  6256- 6407
461  6180- 3254
792        2333
1          2312
212  4019- 1254
462  6181- 1156
465  6184- 1035

      

NOTE. So you don't need to look for differences, the combined values Var1

were 2189-

, 6186-

and 6170-

.

I guess it can be done with some fancy work with []

and duplicated()

, but I can't wrap my head around it. Any recommendations would be appreciated.

dput

for this data can be found on pastebin .

+3


source to share


1 answer


This can be done simply

aggregate(Freq ~ Var1, dd, sum)

      

Or with data.table

library(data.table)
setDT(dd)[, .(Freq = sum(Freq)), by = Var1]

      

Or using dplyr



library(dplyr)
dd %>%
  group_by(Var1) %>%
  summarise(Freq = sum(Freq))

      

although this will add blank entries as well ""

, which is not clear why you treat them differently

In any case, with data.table

you can achieve your exact result with

setDT(dd)[, if(Var1 == "") .SD else sum(Freq), by = Var1]
#      Var1   V1
#  1: 2189- 2335
#  2: 6186- 3072
#  3: 6170- 1900
#  4: 61801  650
#  5: 2181-  618
#  6: 6268-  509
#  7: 61804  307
#  8: 6259-  272
#  9: 61889  265
# 10: 6253-  242
# 11: 2180-  224
# 12: 6256- 6407
# 13: 6180- 3254
# 14:       2333
# 15:       2312
# 16: 4019- 1254
# 17: 6181- 1156
# 18: 6184- 1035

      

+5


source







All Articles