How do I run a for loop through a dataframe string vector in R?

I'm trying to do something very simple: run a loop through a vector of names and use those names in my code.

geo = c(rep("AT",3),rep("BE",3))
time = c(rep(c("1990Q1","1990Q2","1990Q3"),2))
value = c(1:6)
Data <- data.frame(geo,time,value)

      

My real dataset has 14 countries and 75 time periods. I would like to find a function that, for example, loops over the countries and creates one dataset per country, so that I get separate datasets like:

data_AT <- subset(Data, (Data$geo=="AT"))
data_BE <- subset(Data, (Data$geo=="BE"))

      

but with a loop, and ideally with a solution that I can apply to other functions as well :-)

I think it should look something like this:

codes <- unique(Data$geo)
for (i in 1:length(codes))
{k <- codes[i]
data_(k) <- subset(Data, (Data$geo==k))}

      

However, subset() does not work this way, and neither do other functions. I think my problem is that I don't know how to use the value that "k" takes (e.g. "AT") as part of a name in my code. If at all possible, I would really appreciate an answer with a general solution: how can I run a function over a vector containing text and use every element of that vector in my code? Maybe the answer lies in the choice of functions? I have not gotten very far with that, though ...

Any help would be much appreciated!


2 answers


I use loops for similar purposes, for example when saving graphs for different subsets. This may not be the fastest way, but at least I understand it.

There is no need to iterate over the length of the vector; you can loop through the vector itself. You can use assign() to turn a string into a variable name.



geo = c(rep("AT",3),rep("BE",3))
time = c(rep(c("1990Q1","1990Q2","1990Q3"),2))
value = c(1:6)
Data <- data.frame(geo,time,value)

codes <- sort(unique(Data$geo))
for (k in codes) {
 name<-paste("data", k, sep="_")
 assign(name, subset(Data, (Data$geo==k)))
}
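
To read the generated variables back by name, get() and mget() are the counterparts of assign(). The following is a minimal sketch, assuming the same example data as above; the names `nm`, `at_rows` and `all_sets` are only illustrative.

```r
# Rebuild the example data so this snippet runs on its own
Data <- data.frame(
  geo   = rep(c("AT", "BE"), each = 3),
  time  = rep(c("1990Q1", "1990Q2", "1990Q3"), 2),
  value = 1:6,
  stringsAsFactors = FALSE
)

# Create data_AT, data_BE, ... as in the loop above
for (k in sort(unique(Data$geo))) {
  assign(paste("data", k, sep = "_"), Data[Data$geo == k, ])
}

# get() looks up one object by its name given as a string
nm <- paste("data", "AT", sep = "_")   # illustrative name
at_rows <- get(nm)

# mget() fetches several objects at once and returns a named list
all_sets <- mget(paste("data", c("AT", "BE"), sep = "_"))
names(all_sets)
```

Note that mget() hands back a list, which is one reason a list is often the more convenient container from the start.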

      

By the way, filter() from the dplyr package is much faster than subset()!


In R, you usually do this with a list of data.frames instead of several separate data.frames:

lst <- split(Data, Data$geo)
lst
#$AT
#  geo   time value
#1  AT 1990Q1     1
#2  AT 1990Q2     2
#3  AT 1990Q3     3
#
#$BE
#  geo   time value
#4  BE 1990Q1     4
#5  BE 1990Q2     5
#6  BE 1990Q3     6

      

Now you can access each element (each of which is a data.frame) by typing:

lst[["AT"]]
#  geo   time value
#1  AT 1990Q1     1
#2  AT 1990Q2     2
#3  AT 1990Q3     3
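
Because the list elements are named after the countries, the names themselves can serve as the text vector to loop over. The following is a rough sketch of that idea, assuming the same example data; `totals` is just an illustrative name.

```r
# Rebuild the example data so this snippet runs on its own
Data <- data.frame(
  geo   = rep(c("AT", "BE"), each = 3),
  time  = rep(c("1990Q1", "1990Q2", "1990Q3"), 2),
  value = 1:6,
  stringsAsFactors = FALSE
)
lst <- split(Data, Data$geo)

# Loop over the country codes; k is a plain string such as "AT"
for (k in names(lst)) {
  cat(k, "has", nrow(lst[[k]]), "rows\n")
}

# The same idea without an explicit loop: one result per country
totals <- sapply(lst, function(d) sum(d$value))
totals
```

Inside the loop, `lst[[k]]` plays the role that `data_(k)` was meant to play in the question.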

      



If you have a vector of country names for which you want to add 1 to the value column, you can do it like this:

cntrs <- c("BE", "AT")
lst[cntrs] <- lapply(lst[cntrs], function(x) {x$value <- x$value + 1; return(x)} )
lst[cntrs]
#$BE
#  geo   time value
#4  BE 1990Q1     5
#5  BE 1990Q2     6
#6  BE 1990Q3     7
#
#$AT
#  geo   time value
#1  AT 1990Q1     2
#2  AT 1990Q2     3
#3  AT 1990Q3     4

      

Edit: If you really want to stick with the for loop, I recommend not splitting the data into multiple separate data.frames, but rather running the loop over the entire dataset, like this:

cntrs <- "BE"  

for(i in cntrs){
   Data$value[Data$geo == i] <- Data$value[Data$geo == i] + 1
}
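
For this particular update the loop can also be replaced by a single vectorized assignment using %in%. This is only a sketch under the same example data, with `sel` as an illustrative helper name.

```r
# Rebuild the example data so this snippet runs on its own
Data <- data.frame(
  geo   = rep(c("AT", "BE"), each = 3),
  time  = rep(c("1990Q1", "1990Q2", "1990Q3"), 2),
  value = 1:6,
  stringsAsFactors = FALSE
)
cntrs <- c("BE")

# Logical index: TRUE for every row whose country is listed in cntrs
sel <- Data$geo %in% cntrs

# Add 1 to the selected rows only; all other rows stay unchanged
Data$value[sel] <- Data$value[sel] + 1
Data$value
```

This works for any number of countries in `cntrs`, as long as each row should be changed only once.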

      
