Map dataframe with multiple columns as pandas keys
>>> p1.head()
StreamId Timestamp SeqNum
0 3 1490250116391063414 1158
1 3 1490250116391348339 3600
2 3 1490250116391542829 3600
3 3 1490250116391577184 1437
4 3 1490250116392819426 1389
>>> oss.head()
OrderID Symbol Stream SeqNo
0 5000000 AXBANK 3 1158
1 5000001 AXBANK 6 1733
2 5000002 AXBANK 6 1244
3 5000003 AXBANK 6 1388
4 5000004 AXBANK 3 1389
How can this be combined using 2 attributes as key (SeqNum and StreamId)
>>> merge
OrderID Symbol Stream SeqNo Timestamp
0 5000000 AXBANK 3 1158 1490250116391063414
1 5000001 AXBANK 6 1733 NaN
2 5000002 AXBANK 6 1244 NaN
3 5000003 AXBANK 6 1388 NaN
4 5000004 AXBANK 3 1389 1490250116392819426
I tried to use
oss['Time1'] = oss['SeqNo'].map.((p1.set_index('SeqNum')['Timestamp']))
But I need to include both (SeqNum-SeqNo and Stream-StreamId) as keys. I know it can be easy if I rename the column names in both data files and use a merge, but I want to avoid it. I would rather use something generic (take THESE dataframe, draw THESE columns to THOSE columns to OTHER DATAFRAMEs, and extract the required coulmns)
+3
source to share
2 answers
I think you need merge
with drop
:
print (pd.merge(oss, p1, left_on=['Stream','SeqNo'],
right_on=['StreamId','SeqNum'],how='left')
.drop(['StreamId','SeqNum'], axis=1))
OrderID Symbol Stream SeqNo Timestamp
0 5000000 AXBANK 3 1158 1.490250e+18
1 5000001 AXBANK 6 1733 NaN
2 5000002 AXBANK 6 1244 NaN
3 5000003 AXBANK 6 1388 NaN
4 5000004 AXBANK 3 1389 1.490250e+18
Another solution with columns rename
:
d = {'Stream':'StreamId','SeqNo':'SeqNum'}
print (pd.merge(oss.rename(columns=d), p1, how='left'))
OrderID Symbol StreamId SeqNum Timestamp
0 5000000 AXBANK 3 1158 1.490250e+18
1 5000001 AXBANK 6 1733 NaN
2 5000002 AXBANK 6 1244 NaN
3 5000003 AXBANK 6 1388 NaN
4 5000004 AXBANK 3 1389 1.490250e+18
+2
source to share