Combine two data frames on a common column
I want to join two data sources, orders and customers:
orders is a SQL Server table:
orderid| customerid | orderdate | ordercost
------ | -----------| --------- | --------
12000 | 1500 |2008-08-09 | 38610
and customers is a CSV file:
customerid,first_name,last_name,starting_date,ending_date,country
1500,Sian,Read,2008-01-07,2010-01-07,Greenland
I want to join these two tables in my Python application, so I wrote the following code:
import pandas as pd
import pypyodbc

# Connect to SQL Server with the pypyodbc library
connection = pypyodbc.connect("connection string here")
cursor = connection.cursor()
cursor.execute("SELECT * FROM orders")
result = cursor.fetchall()
# Convert the result to a pandas DataFrame
df1 = pd.DataFrame(result, columns=['orderid', 'customerid', 'orderdate', 'ordercost'])
# Read the CSV file
df2 = pd.read_csv(customer_csv)
# Merge the two DataFrames
merged = pd.merge(df1, df2, on='customerid', how='inner')
print(merged[['first_name', 'country']])
I expect:
first_name | country
-----------|--------
Sian | Greenland
But I am getting an empty result.
When I run the same code on two DataFrames that both come from CSV files, it works fine. Any help?
Thanks.
I think the problem is that the customerid columns have different dtypes in the two DataFrames, so the values don't match.
Therefore, you need to convert both columns to int or both to str.
df1['customerid'] = df1['customerid'].astype(int)
df2['customerid'] = df2['customerid'].astype(int)
Or:
df1['customerid'] = df1['customerid'].astype(str)
df2['customerid'] = df2['customerid'].astype(str)
You can also omit how='inner', because it is the default for merge:
merged= pd.merge( df1, df2, on= 'customerid')
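Putting the steps above together, here is a minimal sketch with in-memory stand-ins for both sources. It assumes the database driver returned customerid as text, which is only a guess about the original setup; the sample rows are copied from the question and io.StringIO replaces the real CSV file.

```python
import io

import pandas as pd

# Stand-in for the SQL result. Assumption: the driver returned customerid
# as text. Check the real dtype in your own data.
df1 = pd.DataFrame(
    [(12000, "1500", "2008-08-09", 38610)],
    columns=["orderid", "customerid", "orderdate", "ordercost"],
)

# Stand-in for the CSV file; read_csv infers customerid as an integer here.
customer_csv = io.StringIO(
    "customerid,first_name,last_name,starting_date,ending_date,country\n"
    "1500,Sian,Read,2008-01-07,2010-01-07,Greenland\n"
)
df2 = pd.read_csv(customer_csv)

# Bring both key columns to the same dtype before merging.
df1["customerid"] = df1["customerid"].astype(int)
df2["customerid"] = df2["customerid"].astype(int)

# how='inner' is the default, so it is left out.
merged = pd.merge(df1, df2, on="customerid")
print(merged[["first_name", "country"]])
```

Converting both sides to str instead would work the same way, as long as the text has no stray whitespace or leading zeros.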
An empty DataFrame from pd.merge means there are no matching values in the two frames. Have you checked the data types? Use
df1['customerid'].dtype
to check.
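The column dtype alone can hide what is actually stored, so it may also help to look at the Python types of the values. This is only a sketch: key_types is a hypothetical helper, not a pandas function, and the two frames are made-up toy data.

```python
import pandas as pd


def key_types(df, key):
    """Return the column dtype and the set of Python types of its values."""
    return df[key].dtype, set(df[key].map(type))


# Toy frames (made-up data): the same customer id, stored differently.
df1 = pd.DataFrame({"customerid": ["1500"]})  # text
df2 = pd.DataFrame({"customerid": [1500]})    # integer

for name, df in (("df1", df1), ("df2", df2)):
    dtype, value_types = key_types(df, "customerid")
    print(name, dtype, value_types)
```

If the two frames report different types for the key, the values cannot match until one side is converted.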
As well as converting after import (as suggested in the other answer), you can tell pandas which dtype you want when you read the CSV:
df2 = pd.read_csv(customer_csv, dtype={'customerid': str})
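A quick way to confirm that the dtype argument does what you expect is to read a small sample and inspect one value. In this sketch io.StringIO stands in for the real file, and the row is the one from the question.

```python
import io

import pandas as pd

# In-memory stand-in for the CSV file from the question.
customer_csv = io.StringIO(
    "customerid,first_name,last_name,starting_date,ending_date,country\n"
    "1500,Sian,Read,2008-01-07,2010-01-07,Greenland\n"
)

# Ask pandas to keep customerid as text instead of inferring an integer.
df2 = pd.read_csv(customer_csv, dtype={"customerid": str})

first_id = df2["customerid"].iloc[0]
print(repr(first_id))
```

This only fixes the CSV side; the column coming from SQL Server still has to hold strings for the keys to match.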