Python pandas: merge on a dtype = object column fails to match some rows, after DtypeWarning: Columns have mixed types
I am trying to merge two DataFrames, df1 and df2, on the column Customer_ID. Customer_ID seems to have the same datatype (object) in both.
df1:
Customer_ID | Flag
12345 A
df2:
Customer_ID | Transaction_Value
12345 258478
When I merge the two tables:
new_df = df2.merge(df1, on='Customer_ID', how='left')
For some Customer_IDs it worked, but for others it didn't. For this example, I would get a result like this:
Customer_ID | Transaction_Value | Flag
12345 258478 NaN
I checked the datatypes and they are the same:
df1.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 873353 entries, 0 to 873352
Data columns (total 2 columns):
Customer_ID 873353 non-null object
Flag 873353 non-null object
dtypes: object(2)
memory usage: 20.0+ MB
df2.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 873353 entries, 0 to 873352
Data columns (total 2 columns):
Customer_ID 873353 non-null object
Transaction_Value 873353 int64
dtypes: object(2)
memory usage: 20.0+ MB
When I loaded df1 I got this warning:
C:\Users\xxx\AppData\Local\Continuum\Anaconda2\lib\site-packages\IPython\core\interactiveshell.py:2717: DtypeWarning: Columns (1) have mixed types. Specify dtype option on import or set low_memory=False.
interactivity=interactivity, compiler=compiler, result=result)
When I wanted to check whether a Customer_ID exists, I realized that I have to specify it differently in the two data frames: as an integer in df1 and as a string in df2.
df1.loc[df1['Customer_ID'] == 12345]
df2.loc[df2['Customer_ID'] == '12345']
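Customer_ID has dtype == object in both cases, but that does not mean the individual elements are of the same type. An object column can hold a mix of Python ints and strs, and the int 12345 never matches the string '12345' in a merge. You need to convert the column in both frames to the same type, either str or int.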
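To see which element types an object column actually holds, it can help to count the Python type of each value. This is a minimal sketch, assuming two small made-up frames that mimic the question; the second row and its values are invented for illustration.

```python
import pandas as pd

# Hypothetical miniature versions of df1 and df2 (the second row is invented).
df1 = pd.DataFrame({"Customer_ID": [12345, "67890"], "Flag": ["A", "B"]})
df2 = pd.DataFrame(
    {"Customer_ID": ["12345", "67890"], "Transaction_Value": [258478, 100]}
)
# Force object dtype in both frames so the setup matches the question.
df1 = df1.astype({"Customer_ID": object})
df2 = df2.astype({"Customer_ID": object})

# Count the Python type name of every element in the key column.
types_df1 = df1["Customer_ID"].map(lambda v: type(v).__name__).value_counts()
types_df2 = df2["Customer_ID"].map(lambda v: type(v).__name__).value_counts()
print(types_df1.to_dict())
print(types_df2.to_dict())

# The int 12345 in df1 does not match the str "12345" in df2,
# so that row ends up with a missing Flag after the merge.
merged = df2.merge(df1, on="Customer_ID", how="left")
print(merged)
```

If the counts show more than one type in either frame, the keys need to be normalized before merging.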
Using int
dtype = dict(Customer_ID=int)
df1.astype(dtype).merge(df2.astype(dtype), on='Customer_ID', how='left')
Customer_ID Flag Transaction_Value
0 12345 A 258478
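The int conversion only works if every ID parses as an integer; astype(int) raises on values such as empty strings or text. A hedged sketch of one way to find such values first, using pd.to_numeric with errors="coerce" (the sample IDs below are invented):

```python
import pandas as pd

# Hypothetical IDs: one int, one numeric string, one value that is not a number.
ids = pd.Series([12345, "67890", "N/A"], dtype=object)

# errors="coerce" turns anything that cannot be parsed into NaN instead of raising.
parsed = pd.to_numeric(ids, errors="coerce")

# The original values that failed to parse.
bad = ids[parsed.isna()]
print(bad.tolist())
```

If this list is not empty, converting to str is probably the safer choice.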
Using str
dtype = dict(Customer_ID=str)
df1.astype(dtype).merge(df2.astype(dtype), on='Customer_ID', how='left')
Customer_ID Flag Transaction_Value
0 12345 A 258478
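The DtypeWarning itself suggests another option: fix the type when the file is read, so the column never ends up mixed. A minimal sketch, assuming the data comes from a CSV file; the in-memory text below stands in for the real file, and its contents are invented.

```python
import io

import pandas as pd

# Stand-in for the real file on disk (contents are hypothetical).
csv_text = "Customer_ID,Flag\n12345,A\n0067890,B\n"

# dtype makes read_csv parse every Customer_ID as a string,
# so no value is inferred as an int and leading zeros are kept.
df1 = pd.read_csv(io.StringIO(csv_text), dtype={"Customer_ID": str})
print(df1["Customer_ID"].tolist())
```

Reading both files with the same dtype for Customer_ID should make the later astype calls unnecessary.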