Fread: file not found
I'm trying to read a 2.2 GB file using fread from the data.table package, but I keep getting the same error (it works for other, smaller files):
library(data.table)
data.table 1.9.4 For help type: ?data.table
*** NB: by=.EACHI is now explicit. See README to restore previous behaviour.
train = data.table::fread('train.csv')
Error in data.table::fread("train.csv") :
file not found: train.csv
Of course the file is present (read.csv() works, but is very slow). I am running Ubuntu 12.04 LTS on i686. I'd appreciate any help!
NOTE: The file I'm trying to read is "train.gz", which can be found at: https://www.kaggle.com/c/tradeshift-text-classification/data .
It is a 2.2 GB CSV file, pretty standard.
EDIT: When I use verbose=TRUE it says:
Input contains no \n. Taking this to be a filename to open
OK, just to close the topic: I upgraded my Ubuntu to x86-64, and fread works fine now.
Just a summary to help developers:
1. Downloaded a huge file (2.2 GB in this case).
2. Tried to read it with fread and got the error: file not found: train.csv
I was using Ubuntu 12.04 LTS x86 and the latest stable release of R.
As pointed out, smaller files (~731 MB) worked in this scenario. Thanks for your help anyway!
To open large files on 32-bit Linux systems, you must pass the O_LARGEFILE flag to open, which fread does not do. It is the open call that actually fails, but fread erroneously reports this as a "file not found" error.
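To check from R whether a particular file is likely to hit this limit, something like the following can help. It is only a minimal sketch: needs_large_file_support is a hypothetical helper (not part of data.table), and it assumes the limit is the 2^31 - 1 byte maximum of a signed 32-bit file offset and that file.info itself can report the size (plausible here, since read.csv could read the file).

```r
# Hypothetical helper, not part of data.table: returns TRUE when R is a
# 32-bit build and the file is larger than a signed 32-bit offset allows.
needs_large_file_support <- function(path) {
  stopifnot(file.exists(path))
  size_bytes <- file.info(path)$size          # a double, so sizes above 2 GiB fit
  is_32bit   <- .Machine$sizeof.pointer == 4  # pointers are 4 bytes on 32-bit builds
  limit      <- 2^31 - 1                      # assumed limit without large file support
  is_32bit && size_bytes > limit
}
```

Calling needs_large_file_support("train.csv") on the affected machine would show whether the file falls into the problematic range before fread is tried.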
Another way to enable support for large files is to pass the -D_FILE_OFFSET_BITS=64 option to the compiler when installing the package. Unload and remove data.table, put the following in ~/.R/Makevars:
CFLAGS=-D_FILE_OFFSET_BITS=64
and then run R CMD INSTALL /path/to/data.table_X.Y.Z.tar.gz. The newly installed package will successfully open large files on a 32-bit system.
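The same steps can also be driven from an R session. The following is a rough sketch, not a tested recipe: add_makevars_line is a hypothetical helper that adds the flag to a Makevars file only if it is not already there, and the install calls are left commented out because they change the local R installation. The tarball path is the same placeholder as above.

```r
# Hypothetical helper: add a line to a Makevars file unless it is already present.
add_makevars_line <- function(line, makevars = file.path("~", ".R", "Makevars")) {
  makevars <- path.expand(makevars)
  dir.create(dirname(makevars), showWarnings = FALSE, recursive = TRUE)
  existing <- if (file.exists(makevars)) readLines(makevars, warn = FALSE) else character(0)
  if (!(line %in% existing)) {
    # Rewrite the whole file so a missing trailing newline cannot glue lines together.
    writeLines(c(existing, line), makevars)
  }
  invisible(makevars)
}

# Usage (commented out on purpose):
# add_makevars_line("CFLAGS=-D_FILE_OFFSET_BITS=64")
# remove.packages("data.table")
# install.packages("/path/to/data.table_X.Y.Z.tar.gz", repos = NULL, type = "source")
```

Checking for an existing line first keeps repeated runs from duplicating the flag; any other settings already in the Makevars file are kept as they are.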