Recode vector values based on "group" size
I have an integer vector called x, and I would like to recode its values by group size, in decreasing order (largest group first). Here is the dput output:
c(6, 5, 5, 2, 6, 6, 2, 6, 3, 2, 4, 2, 4, 6, 1, 2, 6, 5, 2, 4, 2, 2, 6, 2, 4, 5, 5, 2, 6, 6, 5, 5, 6, 6, 6, 5, 5, 2, 6, 6, 2, 6, 3, 2, 4, 2, 4, 6, 6, 2, 6, 5, 5, 4, 2, 2, 5, 2, 4, 5, 5, 2, 6, 2, 5, 6, 6, 6)
Here's the output from table:
> table(x)
x
 1  2  3  4  5  6
 1 20  2  8 15 22
So 6 in x must become 1 (because 6 appears most often), 2 must become 2, 5 must become 3, etc.
Does anyone know of an elegant way to do this? I came up with a partial solution, as.integer(names(table[rev(order(table))])), but this is extremely ugly.
EDIT: @Richard Scriven's answer worked for some of my vectors, but something strange happens when the number of "groups" (i.e., the number of unique integers) increases. Here's another example:
> dput(x)
c(8, 1, 2, 8, 15, 15, 8, 15, 3, 8, 13, 8, 15, 15, 4, 8, 5, 13,
13, 13, 8, 6, 15, 8, 7, 13, 13, 8, 15, 8, 14, 13, 15, 15, 15,
13, 13, 8, 15, 15, 8, 15, 9, 8, 15, 8, 15, 15, 15, 15, 13, 15,
13, 10, 8, 11, 13, 8, 12, 13, 13, 8, 15, 8, 14, 15, 16, 15)
> table(x)
x
 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
 1  1  1  1  1  1  1 18  1  1  1  1 14  2 22  1
> tbl <- sort(table(x), decreasing=T)
> tbl
x
15  8 13 14  1  2  3  4  5  6  7  9 10 11 12 16
22 18 14  2  1  1  1  1  1  1  1  1  1  1  1  1
> x.new <- as.integer(names(tbl))[x]
> table(x.new)
x.new
 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
 1  1  1 18  1  1  1  1  1 14  2 22  1  1  1  1
Any ideas why this isn't working?
EDIT 2: A solution that seems to work uses a loop:
for (i in seq_along(tbl)) {
  x.new[which(x == as.integer(names(tbl))[i])] <- i
}
Another idea with order, to avoid some conversions and splitting:
x2 = match(x, order(table(x), decreasing = TRUE))
table(x)
#x
# 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
# 1  1  1  1  1  1  1 18  1  1  1  1 14  2 22  1
table(x2)
#x2
# 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
#22 18 14  2  1  1  1  1  1  1  1  1  1  1  1  1
x2[x == 14]
#[1] 4 4
x2[x == 8]
# [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
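One caveat worth noting: order() returns positions within the table, and those positions coincide with the values themselves only when x consists of the integers 1..n with none missing, as in the example above. For vectors with gaps, one possible variant is to go through factor levels ordered by frequency. The sketch below uses made-up data (x_gap, lev and x_rank are illustrative names, not from the original post):

```r
# Illustrative data with gaps: 10 occurs 3 times, 40 twice, 25 once.
x_gap <- c(10, 10, 10, 40, 40, 25)

# The order()-based lookup compares values against table positions (1, 3, 2),
# so nothing matches here and every element comes back NA.
by_order <- match(x_gap, order(table(x_gap), decreasing = TRUE))

# Alternative: use the table names, sorted by frequency, as factor levels;
# the integer codes of the factor are then the frequency ranks.
lev <- names(sort(table(x_gap), decreasing = TRUE))   # "10" "40" "25"
x_rank <- as.integer(factor(x_gap, levels = lev))
x_rank
# [1] 1 1 1 2 2 3
```

Ties between equally frequent values are ranked in whatever order sort() leaves them, so the ranks of tied groups should not be relied on.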
You can sort() the table and then index the names of the sorted table by x.
tbl <- sort(table(x), decreasing = TRUE)
as.integer(names(tbl))[x] ## or rank(names(tbl))[x]
which gives
# [1] 1 3 3 2 1 1 2 1 5 2 4 2 4 1 6 2 1 3 2 4 2 2 1 2 4 3 3 2 1 1 3 3
# [33] 1 1 1 3 3 2 1 1 2 1 5 2 4 2 4 1 1 2 1 3 3 4 2 2 3 2 4 3 3 2 1 2
# [65] 3 1 1 1
Where
x <- c(6, 5, 5, 2, 6, 6, 2, 6, 3, 2, 4, 2, 4, 6, 1, 2, 6, 5, 2, 4,
2, 2, 6, 2, 4, 5, 5, 2, 6, 6, 5, 5, 6, 6, 6, 5, 5, 2, 6, 6, 2,
6, 3, 2, 4, 2, 4, 6, 6, 2, 6, 5, 5, 4, 2, 2, 5, 2, 4, 5, 5, 2,
6, 2, 5, 6, 6, 6)
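A likely explanation for the mismatch reported in the question's edit: as.integer(names(tbl))[x] uses each value of x as a position in the sorted names, so it returns the x-th most frequent value rather than the frequency rank of x. The two agree only when that permutation is its own inverse, which happens to be the case for the vector above. A minimal sketch of the difference, using a small made-up vector (x_demo and the other names are illustrative), with match() doing the lookup in the intended direction:

```r
# Illustrative data: 3 is most frequent, then 1, then 2.
x_demo <- c(3, 3, 3, 1, 1, 2)
tbl_demo <- sort(table(x_demo), decreasing = TRUE)
vals_by_freq <- as.integer(names(tbl_demo))   # 3 1 2

# Indexing maps position -> value (rank to value): the wrong direction.
by_index <- vals_by_freq[x_demo]
by_index
# [1] 2 2 2 3 3 1

# match() maps value -> position (value to rank): the intended direction.
by_match <- match(x_demo, vals_by_freq)
by_match
# [1] 1 1 1 2 2 3
```

Applied to the original data, this would read match(x, as.integer(names(tbl))), which does not depend on the values being 1..n.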