Merging two dataframes in Python on a column with non-unique values
I am trying to merge two dataframes in Python based on column "X".
Column X in the left dataframe has non-unique values, and column X in the right dataframe has unique values. How can I join the values from the right dataframe to the left dataframe?
I want to join rows from df2 to df1 to form df3.
import pandas as pd

df1 = pd.DataFrame({'A': ['NA', 'EU', 'LA', 'ME'],
                    'B': [50, 23, 21, 100],
                    'X': ['IW233', 'IW455', 'IW455', 'IW100']})
df2 = pd.DataFrame({'C': [50, 12, 12, 11, 10, 16],
                    'X': ['IW455', 'IW200', 'IW233', 'IW150', 'IW175', 'IW100'],
                    'D': ['Aug', 'Sep', 'Jan', 'Feb', 'Dec', 'Nov']})
You can use merge with a left join. If X is the only column the two frames have in common, the on parameter can be omitted:
df = pd.merge(df1, df2, how='left')
print(df)
    A    B      X   C    D
0  NA   50  IW233  12  Jan
1  EU   23  IW455  50  Aug
2  LA   21  IW455  50  Aug
3  ME  100  IW100  16  Nov
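When on is omitted, merge joins on the column names that appear in both frames. A minimal sketch for checking which columns those are, using the frames from the question (the names common, implicit and explicit are only illustrative):

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['NA', 'EU', 'LA', 'ME'],
                    'B': [50, 23, 21, 100],
                    'X': ['IW233', 'IW455', 'IW455', 'IW100']})
df2 = pd.DataFrame({'C': [50, 12, 12, 11, 10, 16],
                    'X': ['IW455', 'IW200', 'IW233', 'IW150', 'IW175', 'IW100'],
                    'D': ['Aug', 'Sep', 'Jan', 'Feb', 'Dec', 'Nov']})

# Column names present in both frames; merge uses these as join keys
# when `on` is not given.
common = df1.columns.intersection(df2.columns).tolist()
print(common)

# Merging with the keys inferred and with the keys spelled out.
implicit = pd.merge(df1, df2, how='left')
explicit = pd.merge(df1, df2, on=common, how='left')
print(implicit)
print(explicit)
```

Here the only shared name is X, so both calls join on the same key.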
If the frames have several column names in common, specify the join column with on:
df = pd.merge(df1, df2, on='X', how='left')
print(df)
    A    B      X   C    D
0  NA   50  IW233  12  Jan
1  EU   23  IW455  50  Aug
2  LA   21  IW455  50  Aug
3  ME  100  IW100  16  Nov
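Since the question relies on X being unique in the right frame, it may be worth letting merge check that. A small sketch using the validate and indicator parameters of pd.merge (the variable name unmatched is only illustrative):

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['NA', 'EU', 'LA', 'ME'],
                    'B': [50, 23, 21, 100],
                    'X': ['IW233', 'IW455', 'IW455', 'IW100']})
df2 = pd.DataFrame({'C': [50, 12, 12, 11, 10, 16],
                    'X': ['IW455', 'IW200', 'IW233', 'IW150', 'IW175', 'IW100'],
                    'D': ['Aug', 'Sep', 'Jan', 'Feb', 'Dec', 'Nov']})

# validate='many_to_one' raises pandas.errors.MergeError if X is
# duplicated in df2, so the result keeps one row per row of df1.
# indicator=True adds a '_merge' column telling where each row came from.
df3 = pd.merge(df1, df2, on='X', how='left',
               validate='many_to_one', indicator=True)
print(df3)

# Rows of df1 whose X value was not found in df2 are marked 'left_only'.
unmatched = df3[df3['_merge'] == 'left_only']
print(len(unmatched))
```

With the data from the question every X in df1 has a match in df2, so no row is marked left_only.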
You can use the join method here:
>>> df1.join(df2.set_index('X'), on='X')
    A    B      X   C    D
0  NA   50  IW233  12  Jan
1  EU   23  IW455  50  Aug
2  LA   21  IW455  50  Aug
3  ME  100  IW100  16  Nov
So we first set the index of the right frame to X (the values are unique in the right frame, so this is not a problem). Then we join the left frame to it on column X.
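The two steps can also be written out separately, with an explicit check that X really is unique in the right frame. This is only a sketch; the name lookup and the error message are illustrative:

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['NA', 'EU', 'LA', 'ME'],
                    'B': [50, 23, 21, 100],
                    'X': ['IW233', 'IW455', 'IW455', 'IW100']})
df2 = pd.DataFrame({'C': [50, 12, 12, 11, 10, 16],
                    'X': ['IW455', 'IW200', 'IW233', 'IW150', 'IW175', 'IW100'],
                    'D': ['Aug', 'Sep', 'Jan', 'Feb', 'Dec', 'Nov']})

# Step 1: make X the index of the right frame.
lookup = df2.set_index('X')

# With duplicate keys on the right, each left row would be repeated
# once per match, so stop early if that is the case.
if not lookup.index.is_unique:
    raise ValueError("X must be unique in the right frame")

# Step 2: join the left frame's X column against that index.
# DataFrame.join performs a left join by default.
df3 = df1.join(lookup, on='X')
print(df3)
```

Because the join is a left join against a unique index, df3 has exactly as many rows as df1.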