In dplyr, how can I sample rows in a dataframe with conditions (sum and min) applied to one or more columns?
I have data that looks like this:
Object Rank1 Cost
OBJ1 1 3
OBJ2 2 3
OBJ3 3 2.5
OBJ4 4 1.5
OBJ5 5 0
OBJ6 6 1
OBJ7 7 0
OBJ8 8 0
OBJ9 9 1
OBJ10 10 0
OBJ11 11 2
OBJ12 12 1
OBJ13 13 2.5
OBJ14 14 1
OBJ15 15 1
OBJ16 16 3
OBJ17 17 0
OBJ18 18 0
OBJ19 19 0
I want to use dplyr to randomly select 5 rows such that the Cost column for those 5 rows sums to exactly 5 and the sum of their ranks is as small as possible. This is a simplified example; my actual data has many more rows. Any ideas on how to do this without writing a loop?
Here's the data:
x <- structure(list(Object = structure(c(1L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L),
.Label = c("OBJ1", "OBJ10", "OBJ11", "OBJ12", "OBJ13", "OBJ14", "OBJ15", "OBJ16", "OBJ17", "OBJ18", "OBJ19", "OBJ2", "OBJ3", "OBJ4", "OBJ5", "OBJ6", "OBJ7", "OBJ8", "OBJ9"), class = "factor"),
Rank1 = 1:19, Cost = c(3, 3, 2.5, 1.5, 0, 1, 0, 0, 1, 0, 2, 1, 2.5, 1, 1, 3, 0, 0, 0)),
.Names = c("Object", "Rank1", "Cost"), class = "data.frame", row.names = c(NA, -19L))
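One possible way to approach this, sketched in base R rather than dplyr, is to treat it as a small subset-selection problem and enumerate every 5-row combination with combn(). This assumes the data is small enough to enumerate (19 rows give 11,628 combinations); the names idx, cost_sum, rank_sum, feasible, best and result are made up for the illustration, and the data frame is rebuilt here with Object as a plain character column so the block runs on its own.

```r
# Data from the question (Object stored as character instead of factor).
x <- data.frame(
  Object = paste0("OBJ", 1:19),
  Rank1  = 1:19,
  Cost   = c(3, 3, 2.5, 1.5, 0, 1, 0, 0, 1, 0, 2, 1, 2.5, 1, 1, 3, 0, 0, 0),
  stringsAsFactors = FALSE
)

# Every 5-row subset: each column of idx holds one candidate set of row indices.
idx <- combn(nrow(x), 5)

# Total cost and total rank of each candidate subset.
cost_sum <- colSums(matrix(x$Cost[idx], nrow = 5))
rank_sum <- colSums(matrix(x$Rank1[idx], nrow = 5))

# Keep subsets whose cost is 5 (with a small tolerance for floating point).
feasible <- abs(cost_sum - 5) < 1e-9
stopifnot(any(feasible))

# Among feasible subsets, find those with the smallest rank sum.
best <- which(feasible & rank_sum == min(rank_sum[feasible]))

# If several subsets tie, pick one of them at random.
pick <- best[sample.int(length(best), 1)]
result <- x[idx[, pick], ]
result
```

The enumeration avoids an explicit loop, but the number of combinations grows very quickly with the number of rows, so this is unlikely to scale to much larger data. In that case the same problem can be written as a 0/1 integer program (minimise the rank sum subject to exactly 5 rows chosen and a cost sum of 5) and handed to a solver package such as lpSolve.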