Merging two dataframes in Python on a column with non-unique values

I am trying to merge two dataframes in Python on column "X".

Column X in the left dataframe has non-unique values, and column X in the right dataframe has unique values. How can I attach the matching values from the right dataframe to each row of the left dataframe?

I want to merge rows from df2 into df1 to form df3

import pandas as pd

df1 = pd.DataFrame({'A': ['NA','EU','LA','ME'],
                    'B': [50, 23,21,100],
                    'X': ['IW233', 'IW455', 'IW455', 'IW100']})

df2 = pd.DataFrame({'C': [50, 12, 12, 11, 10, 16],
                    'X': ['IW455', 'IW200', 'IW233', 'IW150', 'IW175', 'IW100'],
                    'D': ['Aug', 'Sep', 'Jan', 'Feb', 'Dec', 'Nov']})

      




2 answers


You can use merge with a left join. If X is the only column the two frames have in common, the on parameter can be omitted:

df = pd.merge(df1, df2, how='left')
print (df)
    A    B      X   C    D
0  NA   50  IW233  12  Jan
1  EU   23  IW455  50  Aug
2  LA   21  IW455  50  Aug
3  ME  100  IW100  16  Nov

      



If the frames share several column names, specify the join column explicitly with on:

df = pd.merge(df1, df2, on='X', how='left')
print (df)
    A    B      X   C    D
0  NA   50  IW233  12  Jan
1  EU   23  IW455  50  Aug
2  LA   21  IW455  50  Aug
3  ME  100  IW100  16  Nov
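
Since the approach relies on X being unique in df2, it may be worth letting pandas check that assumption. The sketch below reuses the frames from the question and adds two optional merge parameters, validate and indicator; the variable names df3, checked and unmatched are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['NA', 'EU', 'LA', 'ME'],
                    'B': [50, 23, 21, 100],
                    'X': ['IW233', 'IW455', 'IW455', 'IW100']})

df2 = pd.DataFrame({'C': [50, 12, 12, 11, 10, 16],
                    'X': ['IW455', 'IW200', 'IW233', 'IW150', 'IW175', 'IW100'],
                    'D': ['Aug', 'Sep', 'Jan', 'Feb', 'Dec', 'Nov']})

# validate='many_to_one' makes merge raise a MergeError if X turns out
# to be duplicated in the right frame, instead of silently adding rows.
df3 = pd.merge(df1, df2, on='X', how='left', validate='many_to_one')

# indicator=True adds a '_merge' column, which helps spot rows of df1
# whose X value has no counterpart in df2 (they would be 'left_only').
checked = pd.merge(df1, df2, on='X', how='left', indicator=True)
unmatched = checked[checked['_merge'] == 'left_only']

print(df3)
print(len(unmatched), "rows of df1 without a match in df2")
```

With a left join, unmatched rows of df1 are kept and get NaN in the columns coming from df2, so checking for them up front avoids surprises later.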

      



You can also use the join method here:

>>> df1.join(df2.set_index('X'),on='X')
    A    B      X   C    D
0  NA   50  IW233  12  Jan
1  EU   23  IW455  50  Aug
2  LA   21  IW455  50  Aug
3  ME  100  IW100  16  Nov

      



So we first set the index of the right frame to X (the values are unique in the right frame, so this is not a problem). Then we join on column X of the left frame.
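
As a small sketch of those two steps, assuming the frames from the question, the uniqueness of X in the right frame can be checked before it becomes the index. The names right_indexed and joined are illustrative only.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['NA', 'EU', 'LA', 'ME'],
                    'B': [50, 23, 21, 100],
                    'X': ['IW233', 'IW455', 'IW455', 'IW100']})

df2 = pd.DataFrame({'C': [50, 12, 12, 11, 10, 16],
                    'X': ['IW455', 'IW200', 'IW233', 'IW150', 'IW175', 'IW100'],
                    'D': ['Aug', 'Sep', 'Jan', 'Feb', 'Dec', 'Nov']})

# Step 1: confirm X is unique in the right frame, then make it the index.
if not df2['X'].is_unique:
    raise ValueError("X must be unique in df2 for a one-row-per-key lookup")
right_indexed = df2.set_index('X')

# Step 2: join matches the left frame's X column against that index.
# DataFrame.join defaults to how='left', so every row of df1 is kept.
joined = df1.join(right_indexed, on='X')

print(joined)
```

Because the right index is unique, the result has exactly one row per row of df1, including the repeated IW455 entries.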
