Fread: file not found

I'm trying to open a 2.2 GB file using fread from the data.table package, but I keep getting the same error (it works for other, smaller files):

library(data.table)
data.table 1.9.4  For help type: ?data.table
*** NB: by=.EACHI is now explicit. See README to restore previous behaviour.

train  = data.table::fread('train.csv')

      

Error in data.table::fread("train.csv") : file not found: train.csv

Of course the file is present (read.csv() works, but is very slow). I am running Ubuntu 12.04 LTS on i686. I'd appreciate any help!

NOTE: The file I'm trying to read is "train.gz", which can be found at: https://www.kaggle.com/c/tradeshift-text-classification/data

This is a 2.2G csv file, pretty standard.

EDIT: When I use verbose=TRUE it says:

Input contains no \n. Taking this to be a filename to open



2 answers


OK, just to close the topic: I upgraded my Ubuntu to x86-64, and fread works fine now.

Just a summary to help developers:

1- Downloaded a huge file (2.2 GB in this case)

2- Tried to read it with fread and got the error: file not found: train.csv

I was using Ubuntu 12.04 LTS x86 and the latest stable release of R.

As pointed out, smaller files (~731 MB) worked in this scenario. Thanks for your help anyway!



To open large files on 32-bit Linux systems, you must pass the O_LARGEFILE option to open, which fread does not do. It is the open call that actually fails, but fread erroneously reports it as a "file not found" error.
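To make the failure mode concrete, here is a minimal C sketch (C because the failing open call happens in C code, not in R). It is not data.table's actual source; try_open and report_open are hypothetical helpers written only for illustration. On a 32-bit glibc build without large-file support, open on a regular file larger than 2^31 - 1 bytes fails with errno set to EOVERFLOW rather than ENOENT, so checking errno distinguishes "too large" from "really missing". Defining _FILE_OFFSET_BITS as 64 before any include has the same effect as passing -D_FILE_OFFSET_BITS=64 to the compiler.

```c
/* Must come before any #include: makes off_t 64 bits wide even on a
   32-bit build (same effect as compiling with -D_FILE_OFFSET_BITS=64). */
#define _FILE_OFFSET_BITS 64

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Fails at compile time if large-file support did not take effect. */
_Static_assert(sizeof(off_t) == 8, "large-file support is not enabled");

/* Hypothetical helper (not part of data.table): try to open `path`
   read-only. Returns 0 on success, or the errno value on failure. */
int try_open(const char *path)
{
    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return errno;
    close(fd);
    return 0;
}

/* Hypothetical helper: report the real reason an open failed instead
   of assuming that every failure means "file not found". */
void report_open(const char *path)
{
    int err = try_open(path);
    if (err == 0)
        printf("%s: opened OK\n", path);
    else if (err == EOVERFLOW)
        printf("%s: exists, but is too large for this build's off_t\n", path);
    else
        printf("%s: %s\n", path, strerror(err));
}
```

Calling report_open("train.csv") from a small main would print the underlying reason. Without the #define at the top (and with the _Static_assert removed), a 32-bit build would be expected to take the EOVERFLOW branch for a 2.2 GB file.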

Another way to enable support for large files is to pass the -D_FILE_OFFSET_BITS=64 option to the compiler when installing the package. Unload and remove data.table, then put the following in ~/.R/Makevars:



CFLAGS=-D_FILE_OFFSET_BITS=64

      

and then run R CMD INSTALL /path/to/data.table_X.Y.Z.tar.gz. The newly installed package will successfully open large files on a 32-bit system.
