Pandas - select column based on cell value
I have a DataFrame of currency values ββin GBP by date:
USD EUR JPY CAD CNH
date
2017-06-29 0.770151 0.879765 0.006857 0.591919 0.113538
2017-06-28 0.773038 0.878451 0.006892 0.592764 0.113687
2017-06-27 0.781594 0.885069 0.006952 0.593128 0.114724
2017-06-26 0.785476 0.879456 0.007045 0.593763 0.114849
And another DataFrame containing values ββto convert to GBP:
price date currency
0 21404.00000 2017-06-26 USD
3 21445.00000 2017-06-26 USD
4 1.27213 2017-06-26 USD
5 111.67500 2017-06-26 JPY
6 1.27205 2017-06-26 EUR
8 111.65500 2017-06-26 JPY
9 111.65500 2017-06-26 JPY
My goal is to "concatenate" the DataFrame's currency to the values, resulting in a new column being created rate
:
price date currency rate
0 21404.00000 2017-06-26 USD 0.785476
3 21445.00000 2017-06-26 USD 0.785476
4 1.27213 2017-06-27 USD 0.781594
5 111.67500 2017-06-27 JPY 0.006952
6 1.27205 2017-06-28 EUR 0.885069
8 111.65500 2017-06-28 JPY 0.006892
9 111.65500 2017-06-29 JPY 0.006857
I am not interested in any method using apply
or any other type of string based iteration. Instead, I would look for a vectorized way to manipulate the first two DataFrames to get the third.
source to share
Use lookup
to find your rates and assign
to add a new column
df.assign(rate=currencies.lookup(df.date, df.currency))
price date currency rate
0 21404.00000 2017-06-26 USD 0.785476
3 21445.00000 2017-06-26 USD 0.785476
4 1.27213 2017-06-27 USD 0.781594
5 111.67500 2017-06-27 JPY 0.006952
6 1.27205 2017-06-28 EUR 0.878451
8 111.65500 2017-06-28 JPY 0.006892
9 111.65500 2017-06-29 JPY 0.006857
source to share
well, it's a bit of a mess. But you can use the apply function to match the date and then get the column you want, like
currency_data = [['2017-06-29',0.770151,0.879765,0.006857,0.591919,0.113538],
['2017-06-28',0.773038,0.878451,0.006892,0.592764,0.113687]]
currencies = pd.DataFrame(currency_data,
columns=['date', 'USD', 'EUR', 'JPY', 'CAD', 'CNH'])
trans_data = [[21404.00000, '2017-06-29', 'USD'],
[21445.00000, '2017-06-28', 'JPY']]
trans = pd.DataFrame(trans_data, columns=['amount', 'date', 'currency'])
def map_currency(row):
return currencies[currencies['date'] == row['date']][row['currency']].iloc[0]
rates = trans.apply(map_currency, axis=1)
rates.name = 'rate'
pd.concat([trans, rates], axis=1)
source to share