Chunksize keyword for read_excel not implemented

A chunksize argument was available in version 0.16.1.

See: http://pandas.pydata.org/pandas-docs/version/0.16.1/generated/pandas.ExcelFile.parse.html

In the latest version, however, it is no longer available:

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.ExcelFile.parse.html

What is the reason for its removal?

Also, how should I process an Excel file in chunks in the latest version?

This is the code I used:

import pandas as pd

excel = pd.ExcelFile("test.xlsx")

for sheet in excel.sheet_names:
    reader = excel.parse(sheet, chunksize=1000)
    for chunk in reader:
        # process chunk
        pass



1 answer


As EdChum explained in a comment, this feature was removed in 0.17.0. Chris gave the following reason in a comment:

there is no super attractive reason; the basic idea was that the API of the Excel file wrappers (ExcelFile, ExcelWriter) doesn't have any pandas-specific functionality; instead you pass them into the io functions (read_excel, to_excel).

I've updated the docs to describe this specific example. edit: although it might be difficult to see in the description below.



Source: https://github.com/pandas-dev/pandas/pull/11198

I'm still wondering whether there is an alternative way to read an Excel file in chunks.
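
One workaround that is sometimes used is to call read_excel repeatedly with skiprows and nrows, so that each call returns only one slice of the sheet. The sketch below assumes pandas 0.23 or later (where read_excel accepts nrows) and a sheet whose first row is the header. The helper name read_excel_in_chunks is made up for this illustration and is not part of pandas. Note that every call parses the sheet again from the top, so this limits the size of each DataFrame but is not necessarily fast, and dtypes are inferred separately for each chunk.

```python
import pandas as pd


def read_excel_in_chunks(path, sheet_name=0, chunksize=1000):
    """Yield DataFrames of at most `chunksize` rows from one sheet.

    Hypothetical helper (not a pandas API). Assumes row 0 of the
    sheet is the header row.
    """
    start = 0  # number of data rows already consumed
    while True:
        chunk = pd.read_excel(
            path,
            sheet_name=sheet_name,
            header=0,
            # Keep the header (row 0) and skip the data rows that
            # earlier chunks have already returned.
            skiprows=range(1, start + 1),
            nrows=chunksize,
        )
        if chunk.empty:
            break
        yield chunk
        if len(chunk) < chunksize:
            break  # a short chunk means the end of the sheet
        start += chunksize


# Usage, mirroring the loop in the question (file name is an example):
#
# excel = pd.ExcelFile("test.xlsx")
# for sheet in excel.sheet_names:
#     for chunk in read_excel_in_chunks("test.xlsx", sheet, chunksize=1000):
#         ...  # process chunk
```

Because the index of each chunk restarts at 0, pass ignore_index=True if you later concatenate the chunks with pd.concat.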
