All combinations with few restrictions
I want to generate all possible combinations of a set of numbers, but with a few restrictions. I found several similar questions on Stack Overflow, but none of them fit all my limitations:
R: sample() command obeys restrictions
R all combinations of 3 vectors with conditions
Generate all combinations subject to constraints
R - generate all combinations of 2 vectors of given constraints
Below is an example dataset. In my opinion, this is a deterministic dataset.
desired.data <- read.table(text = '
x1 x2 x3 x4
1 1 1 1
1 1 1 2
1 1 1 3
1 1 2 1
1 1 2 2
1 1 2 3
1 1 3 3
1 2 1 1
1 2 1 2
1 2 1 3
1 2 2 1
1 2 2 2
1 2 2 3
1 2 3 3
1 3 3 3
0 1 1 1
0 1 1 2
0 1 1 3
0 1 2 1
0 1 2 2
0 1 2 3
0 1 3 3
0 0 1 1
0 0 1 2
0 0 1 3
0 0 0 1
', header = TRUE, stringsAsFactors = FALSE, na.strings = 'NA')
Here are the limitations:
- Column 1 can only contain 0 or 1
- The last column can only contain 1, 2 or 3
- All other columns can contain 0, 1, 2 or 3
- Once a non-0 appears in a row, the rest of that row cannot contain another 0
- Once a 3 appears in a row, the rest of that row must contain only 3s
- The first non-0 number in a row must be 1
The only way I know to create this dataset is with nested for-loops, as shown below. I've been using this technique for years and finally decided to ask whether there is a better way.
I hope this is not a duplicate and I hope it is not considered too specialized. I often create these types of datasets and a simpler solution would be quite helpful.
my.data <- matrix(0, ncol = 4, nrow = 26) # the loops below fill 26 rows
my.data <- as.data.frame(my.data)
j <- 1
for(i1 in 0:1) {
  if(i1 == 0) i2.begin = 0
  if(i1 == 0) i2.end = 1
  if(i1 == 1) i2.begin = 1
  if(i1 == 1) i2.end = 3
  if(i1 == 2) i2.begin = 1
  if(i1 == 2) i2.end = 3
  if(i1 == 3) i2.begin = 3
  if(i1 == 3) i2.end = 3
  for(i2 in i2.begin:i2.end) {
    if(i2 == 0) i3.begin = 0
    if(i2 == 0) i3.end = 1
    if(i2 == 1) i3.begin = 1
    if(i2 == 1) i3.end = 3
    if(i2 == 2) i3.begin = 1
    if(i2 == 2) i3.end = 3
    if(i2 == 3) i3.begin = 3
    if(i2 == 3) i3.end = 3
    for(i3 in i3.begin:i3.end) {
      if(i3 == 0) i4.begin = 1 # 1 not 0 because last column
      if(i3 == 0) i4.end = 1
      if(i3 == 1) i4.begin = 1
      if(i3 == 1) i4.end = 3
      if(i3 == 2) i4.begin = 1
      if(i3 == 2) i4.end = 3
      if(i3 == 3) i4.begin = 3
      if(i3 == 3) i4.end = 3
      for(i4 in i4.begin:i4.end) {
        my.data[j,1] <- i1
        my.data[j,2] <- i2
        my.data[j,3] <- i3
        my.data[j,4] <- i4
        j <- j + 1
      }
    }
  }
}
my.data
dim(my.data)
Here's the result:
V1 V2 V3 V4
1 0 0 0 1
2 0 0 1 1
3 0 0 1 2
4 0 0 1 3
5 0 1 1 1
6 0 1 1 2
7 0 1 1 3
8 0 1 2 1
9 0 1 2 2
10 0 1 2 3
11 0 1 3 3
12 1 1 1 1
13 1 1 1 2
14 1 1 1 3
15 1 1 2 1
16 1 1 2 2
17 1 1 2 3
18 1 1 3 3
19 1 2 1 1
20 1 2 1 2
21 1 2 1 3
22 1 2 2 1
23 1 2 2 2
24 1 2 2 3
25 1 2 3 3
26 1 3 3 3
EDIT
Sorry, I initially forgot to include constraint #6.
Like @mrip, I start with expand.grid, which can handle the first 3 constraints since they don't interact with other columns:
step1<-expand.grid(0:1,0:3,0:3,1:3)
Then I filter it. The difference between this approach and mrip's is that my filtering happens in one apply call instead of three, so the filtering step runs about 3x faster.
filtered<-step1[apply(step1,1,function(x) all(
  if(length(which(x==0))>0) {max(which(x==0))==length(which(x==0))} else {TRUE},
  if(length(which(x==3))>0) {min(which(x==3))==length(x)-length(which(x==3))+1} else {TRUE},
  x[!x%in%0][1]==1
)),]
That should do it. To go through each check inside the apply call:
if(length(which(x==0))>0) {max(which(x==0))==length(which(x==0))} else {TRUE}
If there are any zeros, this makes sure that nothing non-zero comes before them: the position of the last zero must equal the number of zeros.
if(length(which(x==3))>0) {min(which(x==3))==length(x)-length(which(x==3))+1} else {TRUE}
If there are any 3s, this ensures that nothing other than a 3 comes after the first one.
x[!x%in%0][1]==1
This drops the zeros from the row first, then takes the first remaining element and only accepts the row if that element is 1.
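The same three checks can be written as a named function, which may be easier to read and test than the one-liner. This is only a sketch: is_valid_row and the column names x1 to x4 are made up here, and it uses diff() on logical vectors in place of the which() arithmetic above.

```r
# Hypothetical helper: TRUE if a row satisfies the three cross-column rules.
is_valid_row <- function(x) {
  x <- as.numeric(x)
  # Zeros may only lead: the row never goes from non-zero back to zero.
  zeros_lead <- all(diff(x != 0) >= 0)
  # 3s may only trail: the row never goes from 3 back to a smaller value.
  threes_trail <- all(diff(x == 3) >= 0)
  # The first non-zero entry must be 1 (isTRUE guards the all-zero case).
  starts_with_one <- isTRUE(x[x != 0][1] == 1)
  zeros_lead && threes_trail && starts_with_one
}

# expand.grid covers the per-column value ranges; the helper covers the rest.
grid <- expand.grid(x1 = 0:1, x2 = 0:3, x3 = 0:3, x4 = 1:3)
valid <- grid[apply(grid, 1, is_valid_row), ]
nrow(valid)
```

For the four-column example this should keep the same 26 rows as the filter above.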
Here is code that creates the desired dataset for this particular example. I suspect the code could be generalized. If I manage to generalize it, I will post the result. Although the code is messy and not intuitive, I am convinced that there is a general underlying pattern.
desired.data <- read.table(text = '
x1 x2 x3 x4
1 1 1 1
1 1 1 2
1 1 1 3
1 1 2 1
1 1 2 2
1 1 2 3
1 1 3 3
1 2 1 1
1 2 1 2
1 2 1 3
1 2 2 1
1 2 2 2
1 2 2 3
1 2 3 3
1 3 3 3
0 1 1 1
0 1 1 2
0 1 1 3
0 1 2 1
0 1 2 2
0 1 2 3
0 1 3 3
0 0 1 1
0 0 1 2
0 0 1 3
0 0 0 1
', header = TRUE, stringsAsFactors = FALSE, na.strings = 'NA')
n <- 3 # non-zero numbers
m <- 4-2 # number of middle columns
x1 <- rep(1:0, c(((n*(n-1)) * (n-1) + n), (n*(n-1) + n + (n-1))))
x2 <- rep(c(1:n, 1:0), c(n*m+1, n*m+1, 1, n*m+1, n*1+1))
x3 <- rep(c(rep(1:n, n-1), n, 1:n, 1:0), c(rep(c(n,n,1), n-1), 1, n,n,1, n,1))
x4 <- c(rep(c(rep(1:n, (n-1)), n), (n-1)), n, rep(1:n,(n-1)), n, 1:n, 1)
my.data <- data.frame(x1, x2, x3, x4)
all.equal(desired.data, my.data)
# [1] TRUE
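One possible direction for generalizing, offered only as a sketch and not as the rep()-based pattern above: build the rows one column at a time, where the allowed values in a column depend only on the value in the previous column (after 0 only 0 or 1, after the maximum only the maximum, otherwise anything non-zero). The function name generate_rows and its parameters k (number of columns) and n (largest value) are hypothetical, and the code assumes k is at least 2.

```r
# Hypothetical generator: k columns, values 0..n, rules as in the question.
generate_rows <- function(k = 4, n = 3) {
  rows <- matrix(0:1, ncol = 1)            # the first column is 0 or 1
  for (col in 2:k) {
    is_last <- col == k
    pieces <- lapply(seq_len(nrow(rows)), function(i) {
      prev <- rows[i, col - 1]
      nxt <- if (prev == 0) {
        if (is_last) 1 else 0:1            # the last column cannot be 0
      } else if (prev == n) {
        n                                  # after the maximum, only the maximum
      } else {
        1:n                                # after a non-zero, no more zeros
      }
      # Repeat the partial row once per allowed next value and append it.
      cbind(rows[rep(i, length(nxt)), , drop = FALSE], nxt)
    })
    rows <- do.call(rbind, pieces)
  }
  out <- as.data.frame(unname(rows))
  names(out) <- paste0("x", seq_len(k))
  out
}

gen <- generate_rows(k = 4, n = 3)
nrow(gen)
```

With k = 4 and n = 3 this should produce the 26 rows of the example, ordered from 0 0 0 1 to 1 3 3 3 rather than in the order of desired.data, so a comparison with all.equal would need both data frames sorted the same way first.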
I would use expand.grid to create all combinations and then subset, one constraint at a time:
x<-expand.grid(0:1,0:3,0:3,1:3)
## Once a non-0 appears in a row the rest of that row cannot contain another 0
b1<-apply(x,1,function(z) min(diff(z!=0))==0)
x<-x[b1,]
## Once a 3 appears in a row the rest of that row must only contain 3's
b1<-apply(x,1,function(z) min(diff(z==3))==0)
x<-x[b1,]
## The first non-0 number in a row must be a 1
b1<-apply(x,1,function(z) {
  w<-which(z==0)
  length(w)==0 || z[tail(w,1)+1]==1
})
x<-x[b1,]
And now sort it:
x<-x[order(x[,1],x[,2],x[,3],x[,4]),]
x
Output:
Var1 Var2 Var3 Var4
1 0 0 0 1
9 0 0 1 1
41 0 0 1 2
73 0 0 1 3
11 0 1 1 1
43 0 1 1 2
75 0 1 1 3
19 0 1 2 1
51 0 1 2 2
83 0 1 2 3
91 0 1 3 3
12 1 1 1 1
44 1 1 1 2
76 1 1 1 3
20 1 1 2 1
52 1 1 2 2
84 1 1 2 3
92 1 1 3 3
14 1 2 1 1
46 1 2 1 2
78 1 2 1 3
22 1 2 2 1
54 1 2 2 2
86 1 2 2 3
94 1 2 3 3
96 1 3 3 3
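As a possible alternative to the three apply calls, the constraints can also be checked on adjacent column pairs with vectorized comparisons. This is a sketch under one assumption: that the three cross-column rules reduce to rules about neighboring columns (after 0 only 0 or 1, after 1 or 2 anything non-zero, after 3 only 3). The helper name ok_pair and the column names x1 to x4 are made up for the illustration.

```r
g <- expand.grid(x1 = 0:1, x2 = 0:3, x3 = 0:3, x4 = 1:3)

# Hypothetical helper: is value b allowed directly after value a?
ok_pair <- function(a, b) {
  (a == 0 & b <= 1) |        # after 0: only 0 or 1
    (a %in% 1:2 & b >= 1) |  # after 1 or 2: no more zeros
    (a == 3 & b == 3)        # after 3: only 3
}

keep <- ok_pair(g$x1, g$x2) & ok_pair(g$x2, g$x3) & ok_pair(g$x3, g$x4)
res <- g[keep, ]
res <- res[order(res$x1, res$x2, res$x3, res$x4), ]
nrow(res)
```

Because every comparison works on whole columns, no row-wise apply is needed. For this example it should return the same 26 rows as the output above.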