Python pandas read_excel returns UnicodeDecodeError on describe()
I like pandas, but I'm having real problems with Unicode errors. After read_excel(), calling describe() raises a horrible Unicode error:
import pandas as pd
df=pd.read_excel('tmp.xlsx',encoding='utf-8')
df.describe()
---------------------------------------------------------------------------
UnicodeDecodeError Traceback (most recent call last)
...
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 259: ordinal not in range(128)
I realized that the original Excel file had a non-breaking space (&nbsp;) at the end of many cells, probably to keep long digit strings from being converted to float. That would explain the 0xc2 byte, since U+00A0 is encoded as 0xC2 0xA0 in UTF-8.
One way is to strip the cells, but there must be something better:
for col in df.columns:
    df[col] = df[col].str.strip()
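The loop above calls .str.strip() on every column, which may fail on columns that do not hold strings. A minimal sketch of a more defensive variant follows; it strips only cells that are actually strings and leaves numbers and missing values alone. It assumes Python 3, where str is Unicode and str.strip() treats U+00A0 as whitespace. The helper name strip_text_cells and the sample data are made up for illustration and are not from the original workbook.

```python
import pandas as pd


def strip_text_cells(frame):
    """Return a copy of frame with surrounding whitespace removed from string cells.

    In Python 3, str.strip() also removes the non-breaking space (U+00A0).
    Non-string cells (numbers, NaN) are passed through unchanged.
    """
    out = frame.copy()
    for col in out.columns:
        out[col] = out[col].map(
            lambda v: v.strip() if isinstance(v, str) else v
        )
    return out


# Hypothetical data: long digit strings padded with a trailing non-breaking space.
sample = pd.DataFrame(
    {
        "account": ["00123456789\u00a0", "00987654321\u00a0"],
        "amount": [10, 20],
    }
)
clean = strip_text_cells(sample)
```

Because the helper works on a copy, the original frame stays untouched, so the two can be compared before the cleaned one is used.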
I am using Anaconda 2.2.0 (win64) with pandas 0.16.
Hope this helps someone.
I had this error:
UnicodeDecodeError: 'ascii' codec can't decode byte ....
after reading the Excel file with df = pd.read_excel...
and then trying to assign a new column to the DataFrame, like df['new_col'] = 'foo bar'
After a closer look, I found the problem: some columns in the DataFrame showed up as 'nan'
because of missing column headers. After removing the 'nan' columns with the following code, everything else was fine.
df = df.dropna(axis=1,how='all')
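To show what that dropna call does, here is a small sketch on a made-up frame (the column names and values are hypothetical, not from the original spreadsheet). With axis=1 and how='all', only columns in which every value is missing are dropped; a column with at least one real value is kept.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: "empty" is entirely NaN, "value" is only partly NaN.
df = pd.DataFrame(
    {
        "name": ["a", "b"],
        "empty": [np.nan, np.nan],
        "value": [1.0, np.nan],
    }
)

# Drop only the columns where all values are missing.
df = df.dropna(axis=1, how="all")

# Assigning a new column afterwards, as in the answer above.
df["new_col"] = "foo bar"
```

Note that this removes columns by their contents, not by their header, so a column with a missing header but real data would survive.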