Pandas Concatenate data with different indices
I have two frames, df_1 and df_2, with different indexes and columns. However, some of the index labels and columns overlap.
I created a dataframe df whose index and columns are the union of those of df_1 and df_2, so no index label or column is repeated.
I would like to populate df like this:
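for x in df.index:
    for y in df.columns:
        # Prefer df_1 where it has this cell, otherwise fall back to df_2
        if x in df_1.index and y in df_1.columns:
            df.loc[x, y] = df_1.loc[x, y]
        elif x in df_2.index and y in df_2.columns:
            df.loc[x, y] = df_2.loc[x, y]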
Can anyone tell me an efficient way to do this?
Thanks!
1 answer
I think you need DataFrame.combine_first, which keeps the values of df_1 and fills the gaps (missing labels and NaN) with values from df_2:
import pandas as pd

df_1 = pd.DataFrame({'A': [1, 2, 3],
                     'E': [4, 5, 6],
                     'V': [7, 8, 9],
                     'D': [1, 3, 5]},
                    index=pd.to_datetime(['2017-01-05', '2017-01-04', '2017-01-01']))
print(df_1)
            A  D  E  V
2017-01-05  1  1  4  7
2017-01-04  2  3  5  8
2017-01-01  3  5  6  9
df_2 = pd.DataFrame({'A': [1, 2, 3],
                     'B': [4, 5, 6],
                     'C': [7, 8, 9]},
                    index=pd.date_range('2017-01-01', periods=3)) * 10
print(df_2)
             A   B   C
2017-01-01  10  40  70
2017-01-02  20  50  80
2017-01-03  30  60  90
df = df_1.combine_first(df_2)
print(df)
               A     B     C    D    E    V
2017-01-01   3.0  40.0  70.0  5.0  6.0  9.0
2017-01-02  20.0  50.0  80.0  NaN  NaN  NaN
2017-01-03  30.0  60.0  90.0  NaN  NaN  NaN
2017-01-04   2.0   NaN   NaN  3.0  5.0  8.0
2017-01-05   1.0   NaN   NaN  1.0  4.0  7.0
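To check that combine_first really does what the loop in the question does, here is a rough sketch that compares the two on the sample data. The helpers fill_by_loop and same_values are hypothetical names made up for this check, not part of pandas. Values are compared as floats because the resulting dtypes may differ between pandas versions.

```python
import numpy as np
import pandas as pd

df_1 = pd.DataFrame(
    {"A": [1, 2, 3], "E": [4, 5, 6], "V": [7, 8, 9], "D": [1, 3, 5]},
    index=pd.to_datetime(["2017-01-05", "2017-01-04", "2017-01-01"]),
)
df_2 = pd.DataFrame(
    {"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]},
    index=pd.date_range("2017-01-01", periods=3),
) * 10


def fill_by_loop(first, second):
    """Hypothetical helper: the question's loop, with `first` taking priority."""
    index = first.index.union(second.index)
    columns = first.columns.union(second.columns)
    out = pd.DataFrame(np.nan, index=index, columns=columns)
    for x in index:
        for y in columns:
            if x in first.index and y in first.columns:
                out.loc[x, y] = first.loc[x, y]
            elif x in second.index and y in second.columns:
                out.loc[x, y] = second.loc[x, y]
    return out


def same_values(left, right):
    """Hypothetical helper: same labels and same values, ignoring dtype."""
    return (
        left.index.equals(right.index)
        and left.columns.equals(right.columns)
        and np.allclose(
            left.to_numpy(dtype=float),
            right.to_numpy(dtype=float),
            equal_nan=True,
        )
    )


looped = fill_by_loop(df_1, df_2)
combined = df_1.combine_first(df_2)
matches = same_values(combined, looped)

# Caveat: put a NaN into df_1 at a cell that df_2 also covers.
first_day = pd.Timestamp("2017-01-01")
df_1_nan = df_1.astype(float)
df_1_nan.loc[first_day, "A"] = np.nan

# combine_first fills that NaN from df_2; the loop keeps df_1's NaN.
value_from_combine_first = df_1_nan.combine_first(df_2).loc[first_day, "A"]
value_from_loop = fill_by_loop(df_1_nan, df_2).loc[first_day, "A"]
```

On this data the two approaches agree. They only diverge when df_1 itself contains NaN in a cell that df_2 also covers: combine_first treats that NaN as a gap and fills it from df_2, while the loop keeps it. If df_1 has no missing values, that difference does not matter.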