R loop for ANOVA on multiple files
I would like to run an ANOVA on each of several datasets stored in my working directory. I have come up with:
files <- list.files(pattern = ".csv")
for (i in seq_along(files)) {
mydataset.i <- files[i]
AnovaModel.1 <- aov(DES ~ DOSE, data=mydataset.i)
summary(AnovaModel.1)
}
As you can see, I am very new to loops and cannot get the job done. I also understand that I need extra code to collect all of the resulting outputs in one file. I would appreciate any help towards a routine that runs an ANOVA on every CSV file in a directory (all with the same headers) and writes the output to a file.
You can use list.files with full.names = TRUE if the files are not in your current working directory.
files <- list.files("path_to_my_dir", pattern = "\\.csv$", full.names = TRUE)
# use lapply to loop over all files
out <- lapply(seq_along(files), function(idx) {
  # read the file
  this.data <- read.csv(files[idx], header = TRUE) # choose TRUE/FALSE accordingly
  aov.mod <- aov(DES ~ DOSE, data = this.data)
  # if you want just the summary as an object of class summary.aov
  summary(aov.mod)
  # if you require it as a matrix, comment out the previous line and uncomment the one below
  # as.matrix(summary(aov.mod)[[1]])
})
head(out)
This should give you a list in which each entry is the ANOVA summary (or the summary matrix, if you chose that option) for one file, in the same order as the list of input files.
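Since the question also asks for everything in one output file, here is a minimal sketch of one way to get there from the list: turn each ANOVA table into a data frame, stack them, and write a single CSV. The demo files, the directory name anova_demo, and the output name anova_results.csv are made up for illustration; only the DES and DOSE columns come from the question.

```r
# Hypothetical demo data: two small CSV files with DES and DOSE columns
demo_dir <- file.path(tempdir(), "anova_demo")
dir.create(demo_dir, showWarnings = FALSE)
set.seed(1)
for (name in c("a.csv", "b.csv")) {
  demo <- data.frame(DOSE = rep(c("low", "mid", "high"), each = 5),
                     DES  = rnorm(15))
  write.csv(demo, file.path(demo_dir, name), row.names = FALSE)
}

files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# One ANOVA table per file, returned as a plain data frame
out <- lapply(files, function(f) {
  this.data <- read.csv(f, header = TRUE)
  tab <- as.data.frame(summary(aov(DES ~ DOSE, data = this.data))[[1]])
  # Record which file and which model term each row belongs to
  cbind(file = basename(f), term = trimws(rownames(tab)), tab)
})

# Stack the per-file tables and write them to a single CSV file
all_results <- do.call(rbind, out)
rownames(all_results) <- NULL
results_file <- file.path(tempdir(), "anova_results.csv")
write.csv(all_results, results_file, row.names = FALSE)
```

Each file contributes two rows here (DOSE and Residuals), so the combined table can be filtered or sorted by file afterwards.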
Your mistake is that your loop never loads your data. Your list of file names is in files; you then step through that list and set mydataset.i to the file name at position i, but then you try to run aov on the file name stored in mydataset.i rather than on the contents of that file!
The command you are looking for to redirect output to a file is sink. Consider the following:
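To see the difference between the file name and the loaded data, a minimal sketch follows. The temporary file example_dose.csv and its random contents are invented for the demonstration; the variable names mirror the ones in the question.

```r
# Hypothetical one-file setup so the snippet runs on its own
csv_path <- file.path(tempdir(), "example_dose.csv")
set.seed(42)
write.csv(data.frame(DOSE = rep(c("low", "high"), each = 4), DES = rnorm(8)),
          csv_path, row.names = FALSE)

mydataset.i <- csv_path               # only the file name, a character string
class(mydataset.i)                    # "character"

# Passing the file name as `data` fails: aov() needs the data, not its path
bad <- tryCatch(aov(DES ~ DOSE, data = mydataset.i), error = function(e) e)
inherits(bad, "error")

mydataset.d <- read.csv(mydataset.i)  # now the contents are actually loaded
class(mydataset.d)                    # "data.frame"
good <- aov(DES ~ DOSE, data = mydataset.d)
```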
sink("FileOfResults.txt") # start redirecting output to the file
files <- list.files("path_to_my_dir", pattern = "\\.csv$", full.names = TRUE) # using the fuller call from Arun's answer
for (i in seq_along(files)) {
  mydataset.i <- files[i]
  mydataset.d <- read.csv(mydataset.i) # this line is new
  AnovaModel.1 <- aov(DES ~ DOSE, data = mydataset.d) # this line is modified
  print(summary(AnovaModel.1))
}
sink() # stop redirecting output to the file
I prefer this approach to Arun's because the results are written directly to the file, without going through a list and then having to work out how to store that list in a file in a readable way.
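One caveat with sink: if read.csv or aov stops with an error partway through the loop, the closing sink() is never reached and later console output keeps going into the file. A minimal sketch of a guard against that is to wrap the loop in a function and close the redirect with on.exit. The function name run_all, the demo directory, and the demo files are hypothetical; the per-file header line is an addition so that each summary can be matched to its file.

```r
# Hypothetical demo data: two small CSV files with DES and DOSE columns
demo_dir <- file.path(tempdir(), "anova_sink_demo")
dir.create(demo_dir, showWarnings = FALSE)
set.seed(2)
for (name in c("a.csv", "b.csv")) {
  demo <- data.frame(DOSE = rep(c("low", "mid", "high"), each = 5),
                     DES  = rnorm(15))
  write.csv(demo, file.path(demo_dir, name), row.names = FALSE)
}

files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
results_file <- file.path(tempdir(), "FileOfResults.txt")

run_all <- function(files, results_file) {
  sink(results_file)
  on.exit(sink())  # close the redirect even if a file fails to load or fit
  for (f in files) {
    cat("File:", basename(f), "\n")  # label each block with its source file
    print(summary(aov(DES ~ DOSE, data = read.csv(f))))
    cat("\n")
  }
}

run_all(files, results_file)
```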