In R, loop through a directory and store the file name in a column

I'm trying to do something in R that shouldn't be too hard, I guess. I have a folder with many, many files. They are all named like this: a word, a delimiter, a component, and the .lst extension (read as text), for example airbag.WS-U-E-A.lst.

Each file contains one record per line, like this:

/home/nobackup/SONAR/COMPACT/WR-U-E-A/  <sentence>ja voor den airbag op te pompen eh :p</sentence>
/home/nobackup/SONAR/COMPACT/WR-U-E-A/  <sentence>Dobby , als ze valt heeft ze dan wel al ne airbag hee</sentence>


What I want to do is, in R, create a new dataset containing data from all files. Ideally it would look like this:

ID | filename             | word | component | left-context                               | right-context
1    airbag.WS-U-E-A.lst   airbag   WS-U-E-A    ja voor den                                  op te pompen eh :p
2    airbag.WS-U-E-A.lst   airbag   WS-U-E-A    Dobby , als ze valt heeft ze dan wel al ne   hee


Generating the content itself is something I should be able to do with some regex on the files; however, I'm not really sure how to loop over all the files. For example, I would get the component and word information by applying a regex to the file name, but how do I store the file name for each file in a column?

I tried the following

files <- list.files(path="", pattern="*.lst", full.names=T, recursive=FALSE)
lapply(files, function(x) {
    t <- dirname(x)
    out <- function(t)
})

But the error received was

Error: unexpected '}' in:
"out <- function(t)
}"

1 answer

As David Arrenburg posted in the comments (he has not yet posted it as an answer :D), the solution is to apply the function basename to files:

lapply(files, basename)

This returns a list. For convenience, a vector is usually better; in that case, use sapply:

sapply(files, basename)
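
Since basename is vectorized, basename(files) also returns a plain character vector directly, whereas sapply returns a vector named by the full paths. To get the file name into a column, here is a minimal sketch. It assumes the file names follow the <word>.<component>.lst pattern from the question and that each line wraps its text in <sentence> tags. The helper read_lst, the column names left_context and right_context, and the temporary demo directory are made up for illustration and are not from the original post.

```r
# Hypothetical helper: read one .lst file into a data frame, one row per line.
read_lst <- function(path) {
  lines <- readLines(path, warn = FALSE)
  lines <- lines[nzchar(lines)]              # drop empty lines
  n <- length(lines)

  # Assumed naming scheme: <word>.<component>.lst
  fname <- basename(path)
  parts <- strsplit(sub("\\.lst$", "", fname), ".", fixed = TRUE)[[1]]
  word <- parts[1]
  component <- parts[2]

  # Keep only the text between <sentence> and </sentence>.
  sentence <- sub(".*<sentence>(.*)</sentence>.*", "\\1", lines)

  # Split each sentence at the first occurrence of the word.
  pos <- as.integer(regexpr(word, sentence, fixed = TRUE))
  left <- trimws(substr(sentence, 1, pos - 1))
  right <- trimws(substr(sentence, pos + nchar(word), nchar(sentence)))
  left[pos < 0] <- NA                        # word not found on this line
  right[pos < 0] <- NA

  data.frame(filename = rep(fname, n), word = rep(word, n),
             component = rep(component, n),
             left_context = left, right_context = right,
             stringsAsFactors = FALSE)
}

# Demo data: the two sample lines from the question, in a temporary folder.
demo_dir <- file.path(tempdir(), "lst_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c(
  "/home/nobackup/SONAR/COMPACT/WR-U-E-A/  <sentence>ja voor den airbag op te pompen eh :p</sentence>",
  "/home/nobackup/SONAR/COMPACT/WR-U-E-A/  <sentence>Dobby , als ze valt heeft ze dan wel al ne airbag hee</sentence>"
), file.path(demo_dir, "airbag.WS-U-E-A.lst"))

# Note that pattern is a regular expression, not a shell glob.
files <- list.files(path = demo_dir, pattern = "\\.lst$", full.names = TRUE)

# One data frame per file, stacked into a single data frame with an ID column.
result <- do.call(rbind, lapply(files, read_lst))
result <- cbind(ID = seq_len(nrow(result)), result)
print(result)
```

Building one data frame per file and stacking them with do.call(rbind, ...) keeps the file name attached to every row from the start, so no separate bookkeeping is needed. For a very large number of files this stacking step may become slow, and a word that itself contains a dot would break the assumed file name split.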



