fread with gunzip: which is more memory efficient?

If I have a large data file that is compressed with gzip, say dat.gz,

which is more memory efficient:

mydat <- fread("gunzip -c dat.gz")

      

or first decompressing the file with gunzip to, say, dat

then do

mydat <- fread("dat")

      

I am concerned about memory, not speed, because I want to keep R from crashing.



1 answer


I wrote a 5000x5000 matrix to temp.csv, compressed it with gzip, and profiled the memory usage of the two approaches using profvis:

profvis({system("gunzip -c temp.csv.gz > temp.csv"); mat <- fread("temp.csv")})

      

Memory usage: 190.9 MB



profvis({fread("gunzip -c temp.csv.gz")})

      

Memory usage: 190.8 MB

I ran each command several times, and the memory usage ranged between 190 and 191 MB for both. So I concluded that the two approaches use the same amount of memory.
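For anyone who wants to repeat the comparison, here is a minimal sketch of the setup. It is not the exact script behind the numbers above. Several things in it are assumptions: the matrix contents (random normals, since the post does not say), the smaller hypothetical size n = 1000 (the test above used 5000) so that it runs quickly, the use of the explicit cmd= argument that recent data.table versions provide for shell commands, and the presence of gzip and gunzip on the PATH.

```r
library(data.table)

# Assumption: the matrix contents are not given in the post; random normals are used.
# n is a hypothetical, smaller size so the sketch runs quickly (the post used 5000).
n <- 1000
set.seed(1)
dt <- as.data.table(matrix(rnorm(n * n), nrow = n))

# Write the CSV, compress it, and remove the uncompressed copy
fwrite(dt, "temp.csv")
system("gzip -c temp.csv > temp.csv.gz")
file.remove("temp.csv")

# Approach 1: decompress to disk first, then read the plain file
system("gunzip -c temp.csv.gz > temp.csv")
from_file <- fread("temp.csv")

# Approach 2: let fread run the decompression command itself
# (cmd= is the explicit form; the post passes the command as the first argument)
from_cmd <- fread(cmd = "gunzip -c temp.csv.gz")

# Both approaches should produce the same n x n table
stopifnot(all(dim(from_file) == n))
stopifnot(isTRUE(all.equal(from_file, from_cmd)))
```

To measure memory, each read can be wrapped in profvis({ ... }) as in the commands above. The near-identical figures are what one would expect if, as the fread documentation of the tmpdir argument suggests, the output of a shell command is first written to a temporary file and then read like any other file.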
