Get column header based on value in each row
I have a pandas dataframe, something like below (just illustration):
import datetime
import numpy as np
import pandas as pd

todays_date = datetime.datetime.now().date()
index = pd.date_range(todays_date - datetime.timedelta(10), periods=2, freq='D')
columnheader = ['US', 'Canada', 'UK', 'Japan']
data = np.array([[3, 4, 2, 1], [1, 4, 3, 2]])
df = pd.DataFrame(data, index=index, columns=columnheader)
Result:
            US  Canada  UK  Japan
2015-07-26   3       4   2      1
2015-07-27   1       4   3      2
For each row, I need to find the column headers whose values are 1 and 2, so I should get:
['Japan', 'UK']
['US', 'Japan']
1 answer
You can do the following. The inner apply checks each row for membership in [1, 2] with isin, which generates a boolean Series per row. The outer apply then uses that boolean Series to index df.columns. The result is converted to a list because the sizes won't be aligned if you don't:
In [191]:
df.apply(lambda x: x.isin([1,2]), axis=1).apply(lambda x: list(df.columns[x]), axis=1)
Out[191]:
2015-07-26    [UK, Japan]
2015-07-27    [US, Japan]
Freq: D, dtype: object
Output from the inner apply:
In [192]:
df.apply(lambda x: x.isin([1,2]), axis=1)
Out[192]:
               US Canada     UK Japan
2015-07-26  False  False   True  True
2015-07-27   True  False  False  True
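For readers who want to run the mask-then-index idea end to end, here is a minimal self-contained sketch. It is a variation, not the exact code above: it assumes fixed dates so the result is reproducible, and it calls DataFrame.isin on the whole frame instead of applying isin row by row. The names mask and matches are illustrative only.

```python
import numpy as np
import pandas as pd

# Fixed dates (an assumption, chosen to match the output shown in the question).
index = pd.date_range("2015-07-26", periods=2, freq="D")
df = pd.DataFrame(
    np.array([[3, 4, 2, 1], [1, 4, 3, 2]]),
    index=index,
    columns=["US", "Canada", "UK", "Japan"],
)

# Boolean mask for the whole frame: True where the cell is 1 or 2.
mask = df.isin([1, 2])

# For each row of the mask, keep the column labels where it is True.
# Labels come back in column order, not in the order of the searched values.
matches = pd.Series(
    [list(df.columns[row]) for row in mask.to_numpy()],
    index=df.index,
)
print(matches)
```

Because the labels follow column order, the first row gives ['UK', 'Japan'] rather than the ['Japan', 'UK'] asked for in the question.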
EDIT
If you want to preserve the order of the values you are searching for (1 first, then 2), you can define a function that tests each value in turn and returns the matching columns as a Series:
In [209]:
filter_vals = [1, 2]

def func(x):
    # collect column names in the order given by filter_vals
    l = []
    for val in filter_vals:
        for col in df:
            if x[col] == val:
                l.append(col)
    return pd.Series(l)

df.apply(func, axis=1)
Out[209]:
                0      1
2015-07-26  Japan     UK
2015-07-27     US  Japan
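If the list-of-lists shape from the question is preferred over a DataFrame, the same order-preserving loop can be sketched as a standalone function. This is only an illustration under stated assumptions: the helper name columns_in_value_order is hypothetical, the dates are fixed for reproducibility, and the search values are passed as a parameter instead of being read from a global.

```python
import numpy as np
import pandas as pd

def columns_in_value_order(row, values):
    """Return the labels of `row` whose cell equals each value, in the order of `values`."""
    found = []
    for val in values:
        for col, cell in row.items():
            if cell == val:
                found.append(col)
    return found

# Fixed dates (an assumption, chosen to match the output shown in the question).
index = pd.date_range("2015-07-26", periods=2, freq="D")
df = pd.DataFrame(
    np.array([[3, 4, 2, 1], [1, 4, 3, 2]]),
    index=index,
    columns=["US", "Canada", "UK", "Japan"],
)

# One list of column labels per row, ordered by the searched values 1 then 2.
ordered = pd.Series(
    [columns_in_value_order(row, [1, 2]) for _, row in df.iterrows()],
    index=df.index,
)
print(ordered.tolist())  # [['Japan', 'UK'], ['US', 'Japan']]
```

Unlike returning pd.Series(l), this form does not pad with NaN when rows have different numbers of matches, since each row keeps its own list.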