How do I select one specific index from each column?
Imagine I have a pandas.DataFrame like:
from pandas import DataFrame, Series
x = DataFrame({'a': [7,6,8,0,2,5],
               'b': [3,4,5,6,7,8],
               'c': [3,8,5,6,0,1]}, index=[1,2,3,4,5,6])
Then I have a pandas.Series that gives, for each column, the index label I want to select:
y = Series([4,1,6], index=['a','b','c'])
Is there an idiomatic pandas way to look up these values? I want to avoid looping over the pandas.Series or the pandas.DataFrame, and I would prefer to use something like .loc, .query, etc.
To achieve this, you can use a combination of loc and np.diagonal (with NumPy imported as np):
In [26]:
np.diagonal(x.loc[y])
Out[26]:
array([0, 3, 1], dtype=int64)
loc looks up rows by label, so this returns the rows whose labels are the values of y, in that order:
In [27]:
x.loc[y]
Out[27]:
   a  b  c
4  0  6  6
1  7  3  3
6  5  8  1
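np.diagonal then returns the values on the diagonal of that result: the i-th selected row paired with the i-th column, here row 4 in column a, row 1 in column b, and row 6 in column c.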
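As a self-contained sketch of the two steps above, using the data from the question: the names rows and result are only illustrative, and to_numpy() is used here to pass plain arrays instead of the Series and DataFrame themselves.

```python
import numpy as np
import pandas as pd

x = pd.DataFrame({'a': [7, 6, 8, 0, 2, 5],
                  'b': [3, 4, 5, 6, 7, 8],
                  'c': [3, 8, 5, 6, 0, 1]}, index=[1, 2, 3, 4, 5, 6])
y = pd.Series([4, 1, 6], index=['a', 'b', 'c'])

# Step 1: label-based row lookup. The rows come back in the order given by y.
rows = x.loc[y.to_numpy()]

# Step 2: take element (i, i) of the 3x3 block, i.e. the i-th selected row
# in the i-th column of x.
result = np.diagonal(rows.to_numpy())
```

This pairing is only correct while y lists the columns in the same order as x, since step 1 keeps all columns of x in their original order.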
To make this independent of column order, pass y's values as the row labels and y's index as the column labels:
In [30]:
np.diagonal(x.loc[y.values, y.index])
Out[30]:
array([0, 3, 1], dtype=int64)
The above will work even if the order of the column labels in y differs from the column order of x.
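To illustrate the point about ordering, here is a small sketch with the same lookups listed in a different order. The reordered Series y_shuffled and the names square and picked are hypothetical, introduced only for this example; wrapping the result in a Series keyed by column is an optional convenience, not part of the original answer.

```python
import numpy as np
import pandas as pd

x = pd.DataFrame({'a': [7, 6, 8, 0, 2, 5],
                  'b': [3, 4, 5, 6, 7, 8],
                  'c': [3, 8, 5, 6, 0, 1]}, index=[1, 2, 3, 4, 5, 6])

# Same lookups as in the question, but listed in a different column order.
y_shuffled = pd.Series([6, 4, 1], index=['c', 'a', 'b'])

# Rows are picked by y's values and columns by y's index, so row i and
# column i of the intermediate frame always belong to the same lookup.
square = x.loc[y_shuffled.to_numpy(), y_shuffled.index]

# Label each picked value with the column it came from.
picked = pd.Series(np.diagonal(square.to_numpy()), index=y_shuffled.index)
```

Here picked holds 1 for c, 0 for a, and 3 for b, which matches the values found in the original column order.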