Chunksize keyword for read_excel not implemented
A chunksize argument was available in version 0.16.1.
See: http://pandas.pydata.org/pandas-docs/version/0.16.1/generated/pandas.ExcelFile.parse.html
But in the latest version it is not available.
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.ExcelFile.parse.html
What is the reason for its removal?
Also, how should I process an Excel file in chunks in the latest version?
I used the code below:
import pandas as pd

excel = pd.ExcelFile("test.xlsx")
for sheet in excel.sheet_names:
    reader = excel.parse(sheet, chunksize=1000)
    for chunk in reader:
        # process chunk
        pass
As EdChum explained in a comment, this feature was removed in 0.17.0. Chris gave the following reason in a comment:
there is no super attractive reason; the basic idea was that the api to_excel, that is, "ExcelFileWrapper" (ExcelFile, ExcelWriter) doesn't have any pandas-specific functionality, instead you pass it in the io (read_excel, to_excel) functions.
I've updated the docs to describe this specific example. edit: although it might be difficult to see in the description below.
Source: https://github.com/pandas-dev/pandas/pull/11198
I'm still wondering whether there is an alternative way to read an Excel file in chunks.
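One possible workaround, offered as a rough sketch rather than an official replacement: read_excel accepts skiprows and, in more recent pandas versions, nrows, so a small generator can request one window of rows at a time. The helper name read_excel_in_chunks is hypothetical (it is not part of pandas), and the sketch assumes each sheet has a single header row in its first line.

```python
import pandas as pd


def read_excel_in_chunks(path, sheet_name=0, chunksize=1000):
    """Yield DataFrames of at most `chunksize` rows from one sheet.

    Hypothetical helper, not a pandas API. Assumes a single header row
    and a pandas version whose read_excel supports `nrows`.
    """
    start = 0  # number of data rows already yielded
    while True:
        chunk = pd.read_excel(
            path,
            sheet_name=sheet_name,
            # Keep row 0 (the header) and skip the data rows already read.
            skiprows=list(range(1, start + 1)),
            nrows=chunksize,
        )
        if chunk.empty:
            break
        yield chunk
        if len(chunk) < chunksize:
            break  # short chunk means the sheet is exhausted
        start += chunksize
```

It can be dropped into the loop from the question by replacing excel.parse(sheet, chunksize=1000) with read_excel_in_chunks("test.xlsx", sheet_name=sheet, chunksize=1000). Note that every call opens the workbook again and reads it from the top, so this limits the size of each DataFrame held in memory rather than speeding up the read. Column dtypes are also inferred separately for each chunk and may differ between chunks.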