Mark duplicates recursively in data.frame
Given the test dataset,
dat=data.frame(name=c('A','A','B','C','C','C'),val=c(1,1,2,2,3,2))
name val
A    1
A    1
B    2
C    2
C    3
C    2
What would be the most efficient way to get this result?
name val
A    1
A-1  1
B    2
C    2
C-1  3
C-2  2
So I just want to mark the duplicates with a custom id. I could mark them all with a common id using paste(dat[which(duplicated(dat$name)),1],"-1",sep=''), but that would just append "-1" to every duplicate. If an item appears a third time, I want it marked with "-2", and so on.
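For illustration, here is a minimal sketch of that attempt, run on the name column alone. The variable names dup and flagged are made up for this example, and ifelse() is used only to keep the non-duplicated names in place so the limitation is easy to see.

```r
# Names from the test dataset, as a plain character vector
name <- c("A", "A", "B", "C", "C", "C")

# duplicated() flags every repeat after the first occurrence
dup <- duplicated(name)

# Appending a fixed "-1" gives the second and third "C" the same label
flagged <- ifelse(dup, paste(name, "-1", sep = ""), name)
print(flagged)
# [1] "A"   "A-1" "B"   "C"   "C-1" "C-1"
```

The last two entries collide, which is why a running count per name is needed instead of a fixed suffix.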
Greetings
3 answers
This is not exactly what you asked, but you can try this:
within(dat, {
  Name <- paste(name, as.numeric(ave(as.character(name),
                                     name, FUN = seq_along)) - 1,
                sep = "-")
  rm(name)
})
#   val Name
# 1   1  A-0
# 2   1  A-1
# 3   2  B-0
# 4   2  C-0
# 5   3  C-1
# 6   2  C-2
Or with a slight modification:
within(dat, {
  name <- as.character(name)
  Name <- as.numeric(ave(name, name, FUN = seq_along)) - 1
  Name <- ifelse(Name == 0, name, paste(name, Name, sep = "-"))
  rm(name)
})
#   val Name
# 1   1    A
# 2   1  A-1
# 3   2    B
# 4   2    C
# 5   3  C-1
# 6   2  C-2
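As a possible alternative to the ave() approach, base R's make.unique() appears to produce the same labels in a single call. This is only a sketch on the test dataset: it overwrites the name column in place rather than creating a new Name column, and the as.character() call is there on the assumption that name may be stored as a factor.

```r
# Test dataset from the question
dat <- data.frame(name = c("A", "A", "B", "C", "C", "C"),
                  val = c(1, 1, 2, 2, 3, 2))

# make.unique() leaves the first occurrence alone and appends
# sep plus a running number to each later repeat
dat$name <- make.unique(as.character(dat$name), sep = "-")
print(dat$name)
# [1] "A"   "A-1" "B"   "C"   "C-1" "C-2"
```

One caveat: make.unique() guarantees unique results, so if the column already contains a label such as "A-1", the numbering may differ from a plain per-name count.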