Condition in a double loop over rows and columns

I have an "index out of bounds" problem. For each observation, I want to get the first and last month of a run of three consecutive "1" (or "True") values. I want to create two new columns, "begin" and "end", holding the corresponding first and last month. In my example, for the first observation begin is avril and end is juin; for the fifth observation begin is fevrier and end is avril; for the ninth observation begin is janvier and end is mars; and so on.

I tried to do this:

nom <- letters[1:5]
pseudo <- paste("name", 21:25, sep = "")
janvier <- c(0, 1, 1, 1, 0)
fevrier <- c(1, 1, 1, 1, 1)
mars <- c(0, 0, 0, 1, 1)
avril <- c(1, 1, 1, 0, 1)
mai <- c(1, 0, 1, 1, 1)
juin <- c(1, 1, 0, 1, 0)

df <- data.frame(nom =nom, pseudo = pseudo, janvier = janvier,
                 fevrier = fevrier, mars = mars, avril = avril,
                 mai = mai, juin = juin)

dfm <- as.matrix(df[, -c(1, 2)])

my_matrix <- matrix(nrow = 10, ncol = 6)


for(i in 1:dim(dfm)[1]){
  for(j in 1:dim(dfm)[2]){
    if(dfm[i, j] + dfm[i, j+1] + dfm[i, j+2] == 3){
      my_matrix[i, j] <- "periode_ok"
      my_matrix[i, j+1] <- "periode_ok"
      my_matrix[i, j+2] <- "periode_ok"
    } 
  }
}

      

The output should be as follows:

begin <- c("avril", "no info", "no info",
           "janvier", "fevrier", "avril", "no info",
           "no info", "janvier", "fevrier")
end <- c("juin", "no info", "no info", "mars",
         "avril", "juin", "no info", "no info",
         "mars", "avril")

output <- data.frame(nom =nom, pseudo = pseudo, janvier = janvier,
                 fevrier = fevrier, mars = mars, avril = avril,
                 mai = mai, juin = juin, begin = begin,end = end)

      

Any help would be appreciated.

2 answers


First of all, constructs like 1:dim(dfm)[1] are dangerous: if dim(dfm)[1] equals zero, you get the perfectly valid vector 1:0, that is c(1, 0), and the loop will try to access row 1 and then row 0 of a matrix that has no rows. Indexing row 1 of such a matrix throws a "subscript out of bounds" error. The recommended solution is to use seq_len(...). Second, instead of dim(dfm)[.] I used nrow and ncol.

Now for your error. You are trying to refer to columns j + 1 and j + 2, so as soon as j reaches ncol(dfm) - 1 you go out of bounds. The code below drops the last two values from the range of j.



n <- ncol(dfm)
for(i in seq_len(nrow(dfm))){
  for(j in seq_len(n)[-c(n - 1, n)]){
    if(dfm[i, j] + dfm[i, j+1] + dfm[i, j+2] == 3){
      my_matrix[i, j] <- "periode_ok"
      my_matrix[i, j+1] <- "periode_ok"
      my_matrix[i, j+2] <- "periode_ok"
    } 
  }
}

my_matrix
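
As a small illustration of the 1:n versus seq_len(n) point above, here is a minimal sketch. The names n_rows, bad, good, visited_bad and visited_good are made up for this example and do not come from the question.

```r
# Pretend the matrix has no rows at all.
n_rows <- 0

# 1:0 counts down, so it has two elements (1 and 0) instead of none.
bad <- 1:n_rows
# seq_len(0) is an empty integer vector.
good <- seq_len(n_rows)

# Count how many times each loop body would run.
visited_bad <- 0
for (i in bad) visited_bad <- visited_bad + 1

visited_good <- 0
for (i in good) visited_good <- visited_good + 1

print(bad)           # the two values 1 and 0
print(length(good))  # zero elements
print(c(visited_bad, visited_good))
```

With 1:n_rows the body runs twice even though there is nothing to iterate over, while with seq_len(n_rows) it is skipped entirely.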

      


Of course, there is a vectorized solution for this, but if you want to fix the for loop, you need to restrict j to the number of columns of dfm minus 2, since you are checking two columns ahead. Based on what you have provided, this will help you; however, it is not clear how you get 10 rows (each one duplicated) out of the 5-row df.

my_matrix <- matrix("no info", nrow = 5, ncol = 2)
colnames(my_matrix) <- c("begin", "end")

for(i in 1:dim(dfm)[1]){
  for(j in 1:(dim(dfm)[2]-2)){
    if(dfm[i, j] + dfm[i, j+1] + dfm[i, j+2] == 3){
      my_matrix[i, 1] <- colnames(dfm)[j]
      my_matrix[i, 2] <- colnames(dfm)[j+2]
      break
    }
  }
}

output <- cbind(df, my_matrix)

      



Then the result will be:

output

#   nom pseudo janvier fevrier mars avril mai juin   begin     end 
# 1   a name21       0       1    0     1   1    1   avril    juin 
# 2   b name22       1       1    0     1   0    1 no info no info 
# 3   c name23       1       1    0     1   1    0 no info no info 
# 4   d name24       1       1    1     0   1    1 janvier    mars 
# 5   e name25       0       1    1     1   1    0 fevrier   avril
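
The vectorized solution mentioned at the start of this answer is not spelled out, so the following is only one possible sketch of it, not necessarily what was meant. It rebuilds the 5-row example matrix from the question, sums every window of three adjacent columns for all rows at once, and picks the first window whose sum is 3. The names k, starts, win, hit, found and first are assumptions introduced for this example.

```r
# Rebuild the month matrix from the question (5 rows, 6 month columns).
dfm <- cbind(
  janvier = c(0, 1, 1, 1, 0),
  fevrier = c(1, 1, 1, 1, 1),
  mars    = c(0, 0, 0, 1, 1),
  avril   = c(1, 1, 1, 0, 1),
  mai     = c(1, 0, 1, 1, 1),
  juin    = c(1, 1, 0, 1, 0)
)

k <- 3                                # run length we are looking for
starts <- seq_len(ncol(dfm) - k + 1)  # possible first columns of a window

# Sum of each window of three adjacent columns, for every row at once.
win <- dfm[, starts, drop = FALSE] +
  dfm[, starts + 1, drop = FALSE] +
  dfm[, starts + 2, drop = FALSE]

hit <- win == k            # TRUE where a window contains only ones
found <- rowSums(hit) > 0  # does the row have at least one such window?

# Column index of the first TRUE in each row (meaningless where found is FALSE).
first <- max.col(hit * 1, ties.method = "first")

begin <- ifelse(found, colnames(dfm)[first], "no info")
end   <- ifelse(found, colnames(dfm)[first + k - 1], "no info")

print(data.frame(begin = begin, end = end))
```

On the example data this gives the same begin and end columns as the loop version above. Like the loop with break, it reports only the first qualifying run in each row.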

      
