Pandas: convert DataFrame to UTF-8
I have a DataFrame df consisting of 100 rows and 24 columns, all of string dtype. When I try to insert the DataFrame into KDB, it throws the following error:
UnicodeEncodeError: 'ascii' codec can't encode character '\xd3' in position 9: ordinal not in range(128)
Here is the first row of my df:
AnnouncementDate AuctionDate BBT \
_id
00000067 2012-12-11T00:00:00.000+00:00 NaN FHLB
CouponDividendRate DaysToSettle \
_id
00000067 0.61 1
Description \
_id
00000067 FHLB 0.61 12/28/16
FirstSettlementDate ISN IsAgency IsWhenIssued \
_id
00000067 2012-12-28T00:00:00.000+00:00 US313381K796 True False
... OnTheRunTreasury OperationalIndicator \
_id ...
00000067 ... NaN False
OriginalAmountOfPrincipal OriginalMaturityDate \
_id
00000067 13000000.0 NaN
PrincipalAmountOutstanding SCSP SMCP \
_id
00000067 0.0 313381K79 76000000
SecurityTypeLevel1 SecurityTypeLevel2 TCK
_id
00000067 US-DOMESTIC NaN NaN
My question is: is there an easy way to convert my df
to UTF-8?
Perhaps something like df = df.encode('utf-8')?
Thanks.
It depends on how you output the data. If you are just writing CSV files that you then import into KDB, you can easily specify the encoding:
df.to_csv('df_output.csv', encoding='utf-8')
Or, you can set the encoding when you initially read the data into pandas, using the same encoding argument in read_csv.
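A quick sketch of the round trip (the column names and values here are made up to mirror your df, and the file path is just an example): write with an explicit UTF-8 encoding, then read it back the same way.

```python
import os
import tempfile

import pandas as pd

# A small frame containing a non-ASCII character ('\xd3' = 'Ó'),
# the same kind of character that triggers the ascii-codec error.
df = pd.DataFrame({"Description": ["FHLB \xd3 0.61 12/28/16"], "BBT": ["FHLB"]})

path = os.path.join(tempfile.gettempdir(), "df_output.csv")

# Write and read back with an explicit UTF-8 encoding.
df.to_csv(path, index=False, encoding="utf-8")
roundtrip = pd.read_csv(path, encoding="utf-8")

print(roundtrip.equals(df))  # True
```

With encoding='utf-8' on both sides, the 'Ó' survives the round trip instead of raising a UnicodeEncodeError.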
If you are connecting directly to KDB using SQLAlchemy or something similar, try specifying the encoding in the connection itself - see this question: Another UnicodeEncodeError when using Pandas to_sql method with MySQL
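As a minimal sketch of the to_sql route, here is the idea using an in-memory SQLite database (which stores TEXT as UTF-8 natively) as a stand-in; the table name and sample value are hypothetical, and for MySQL you would instead put the charset in the connection URL as shown in the comment.

```python
import pandas as pd
from sqlalchemy import create_engine

df = pd.DataFrame({"Description": ["FHLB \xd3"]})

# In-memory SQLite stores TEXT as UTF-8 natively. For MySQL you would
# instead pass the charset in the engine URL, e.g.:
#   create_engine("mysql+pymysql://user:pw@host/db?charset=utf8mb4")
engine = create_engine("sqlite://")

# Insert the frame, then read the value back to confirm it survived intact.
df.to_sql("securities", engine, index=False)
back = pd.read_sql("SELECT Description FROM securities", engine)
print(back.loc[0, "Description"])  # FHLB Ó
```

The key point is that the encoding is handled at the connection/driver level rather than by converting the DataFrame itself.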