Reading CSV zip files in Python
7 replies
Yaron had a better answer, but I thought I would add some code that iterates over multiple files inside a zip archive, reads each one into a DataFrame, and then concatenates the results:
import os
import pandas as pd
import zipfile

curDir = os.getcwd()
zf = zipfile.ZipFile(os.path.join(curDir, 'targetfolder.zip'))
text_files = zf.infolist()
list_ = []

print("Uncompressing and reading data... ")
for text_file in text_files:
    print(text_file.filename)
    # zf.open() returns a file-like object that read_csv can consume directly
    df = pd.read_csv(zf.open(text_file.filename))
    # do df manipulations
    list_.append(df)

# combine the per-file DataFrames into one
df = pd.concat(list_)
+2
If you are not using Pandas, this can be done entirely with the standard library. Here is Python 3.7 code:
import csv
from io import TextIOWrapper
from zipfile import ZipFile

with ZipFile('yourfile.zip') as zf:
    with zf.open('your_csv_inside_zip.csv', 'r') as infile:
        # zf.open() yields bytes; wrap it so csv.reader receives decoded text
        reader = csv.reader(TextIOWrapper(infile, encoding='utf-8', newline=''))
        for row in reader:
            # process the CSV here
            print(row)
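To process every CSV in the archive rather than one named member, the same standard-library approach can be wrapped in a loop over the member names. This is only a minimal sketch: the helper name read_all_csv_rows and the small in-memory demo archive are made up for illustration, and UTF-8 encoding is assumed.

```python
import csv
import io
from zipfile import ZipFile


def read_all_csv_rows(zip_source, encoding="utf-8"):
    """Return {member name: list of rows} for every .csv member of the archive.

    Hypothetical helper; zip_source is a path or a binary file-like object.
    """
    rows_by_file = {}
    with ZipFile(zip_source) as zf:
        for name in zf.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skip directories and non-CSV members
            with zf.open(name) as infile:
                # Decode the binary member stream before handing it to csv.reader
                text = io.TextIOWrapper(infile, encoding=encoding, newline="")
                rows_by_file[name] = list(csv.reader(text))
    return rows_by_file


# Build a small archive in memory so the sketch runs without files on disk.
buffer = io.BytesIO()
with ZipFile(buffer, "w") as zf:
    zf.writestr("a.csv", "id,name\n1,alpha\n")
    zf.writestr("b.csv", "id,name\n2,beta\n")
    zf.writestr("notes.txt", "not a csv")
buffer.seek(0)

result = read_all_csv_rows(buffer)
for name, rows in result.items():
    print(name, rows)
```

Reading each member fully into a list keeps the example short; for large files it would probably be better to yield rows one at a time instead.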
+2
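Pandas has supported zip-compressed CSV files natively since version 0.18.1: read_csv has a compression parameter that accepts {'infer', 'gzip', 'bz2', 'zip', 'xz', None}, and the default is 'infer', which detects the compression from the file extension. With 'zip', the archive must contain only one data file.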
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html
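As a rough illustration of the compression parameter, the sketch below writes a one-file zip archive to a temporary directory and reads it back with read_csv. The names example.zip and data.csv and the sample rows are made up; it assumes a pandas version with zip support and an archive holding exactly one CSV.

```python
import os
import tempfile
import zipfile

import pandas as pd

# Create a throwaway archive with a single CSV member (names are illustrative).
tmp_dir = tempfile.mkdtemp()
zip_path = os.path.join(tmp_dir, "example.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("data.csv", "id,name\n1,alpha\n2,beta\n")

# Explicit compression argument.
df_explicit = pd.read_csv(zip_path, compression="zip")

# With the default compression='infer', the '.zip' extension is enough.
df_inferred = pd.read_csv(zip_path)

print(df_explicit)
```

For archives with several CSV members, read_csv cannot be pointed at the zip directly; opening each member through zipfile, as in the first answer, is one way around that.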
+1