Fread with gunzip: What's more memory efficient?

If I have a large data file that is compressed with gzip, say dat.gz, which is more memory efficient:

mydat <- fread("gunzip -c dat.gz")


or first decompressing the file to, say, dat, and then running

mydat <- fread("dat")


I am concerned about memory, not speed; I want to prevent R from crashing.



1 answer

I wrote a 5000x5000 matrix to temp.csv and profiled the memory usage of the two approaches using profvis:

profvis({system("gunzip -c temp.csv.gz > temp.csv"); mat <- fread("temp.csv")})


Memory usage: 190.9 MB

profvis({fread("gunzip -c temp.csv.gz")})


Memory usage: 190.8 MB

I ran each command several times, and the memory usage ranged between 190 and 191 MB for both. So I concluded that the memory usage is the same.
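
A likely explanation for the identical numbers, as far as I understand data.table: when fread is given a shell command, it redirects the command's output to a temporary file and then parses that file, so both approaches end up reading decompressed data from disk. Below is a minimal sketch for checking the comparison yourself. It is not the exact script used above: the 500x500 size, the variable names and the tempdir() paths are assumptions chosen to keep it small, it uses the cmd argument available in recent data.table versions, and it requires gunzip on the PATH.

```r
library(data.table)

# Hypothetical small example; the measurement above used a 5000x5000 matrix.
set.seed(1)
n <- 500
csv_path <- file.path(tempdir(), "temp.csv")
gz_path <- paste0(csv_path, ".gz")

# Write the matrix as CSV, then write a gzip-compressed copy using base R.
fwrite(as.data.table(matrix(rnorm(n * n), n, n)), csv_path)
gz_con <- gzfile(gz_path, "wb")
writeBin(readBin(csv_path, "raw", n = file.size(csv_path)), gz_con)
close(gz_con)

# Approach 1: read the already decompressed file.
from_file <- fread(csv_path)

# Approach 2: let fread run gunzip itself (assumes gunzip is installed).
from_cmd <- fread(cmd = paste("gunzip -c", shQuote(gz_path)))

# Both approaches should yield the same table, and therefore the same
# size for the resulting object in memory.
print(all.equal(from_file, from_cmd))
print(object.size(from_file) == object.size(from_cmd))
```

Note that object.size only compares the final objects; to compare peak memory while reading, wrap each fread call in profvis as shown above.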


