Why does a column stay in the DataFrame index even after deleting it?

Consider the following piece of code:

>>> import pandas
>>> data = pandas.DataFrame({ 'user': [1, 5, 3, 10], 'week': [1, 1, 3, 4], 'value1': [5, 4, 3, 2], 'value2': [1, 1, 1, 2] })
>>> data = data.pivot_table(index='user', columns='week', fill_value=0)
>>> data['target'] = [True, True, False, True]
>>> data
     value1       value2       target
week      1  3  4      1  3  4
user
1         5  0  0      1  0  0   True
3         0  3  0      0  1  0   True
5         4  0  0      1  0  0  False
10        0  0  2      0  0  2   True

      

Now if I call this:

>>> 'target' in data.columns
True

      

It returns True, as expected. However, why does this return True?

>>> 'target' in data.drop('target', axis=1).columns
True

      

How can I drop a column from the table so that it is no longer in the index and the above statement returns False?

1 answer


As of pandas 0.19.2, a MultiIndex keeps every label it has ever used in its levels. Dropping a column does not remove its label from the MultiIndex, so a membership test still finds it. There is a long GitHub issue discussing this behavior.

So you have to work around the problem and make some assumptions. If you are sure that the labels you are checking live at a specific index level (level 0 in your example), one way to do it is:

'target' in data.drop('target', axis=1).columns.get_level_values(0)
Out[145]: False
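
To see where the stale label lives, it may help to compare the levels stored in the index with the labels actually in use. The sketch below rebuilds the frame from the question. Note that `remove_unused_levels()` is, as far as I know, only available in pandas 0.20 and later, so it would not work on 0.19.2; the names `dropped` and `cleaned` are just for illustration.

```python
import pandas as pd

data = pd.DataFrame({'user': [1, 5, 3, 10], 'week': [1, 1, 3, 4],
                     'value1': [5, 4, 3, 2], 'value2': [1, 1, 1, 2]})
data = data.pivot_table(index='user', columns='week', fill_value=0)
data['target'] = [True, True, False, True]

dropped = data.drop('target', axis=1)

# The stored level still lists 'target', even though no column uses it.
stale = 'target' in dropped.columns.levels[0]

# The labels actually in use no longer include it.
in_use = 'target' in dropped.columns.get_level_values(0)

# Assumes pandas >= 0.20: rebuild the levels from the labels in use.
cleaned = dropped.copy()
cleaned.columns = dropped.columns.remove_unused_levels()
still_stale = 'target' in cleaned.columns.levels[0]

print(stale, in_use, still_stale)
```

If that method is available, reassigning the cleaned index to `columns` should leave no trace of the dropped label in the levels.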

      



If it could be any level, you can use get_values() and search the entire list:

import itertools as it
list(it.chain.from_iterable(data.drop('target', axis=1).columns.get_values()))
Out[150]: ['value1', 1, 'value1', 3, 'value1', 4, 'value2', 1, 'value2', 3, 'value2', 4]
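
As far as I know, `get_values()` was deprecated and later removed in newer pandas releases, so the snippet above may not run there. Here is a minimal sketch of the same any-level search that relies only on `tolist()`, which for a MultiIndex gives one tuple per column. `labels_in_use` is a hypothetical helper name, not a pandas function.

```python
import itertools
import pandas as pd

def labels_in_use(index):
    """Collect every label that appears in at least one entry of `index`."""
    if isinstance(index, pd.MultiIndex):
        # Each entry of a MultiIndex is a tuple with one label per level.
        return set(itertools.chain.from_iterable(index.tolist()))
    return set(index.tolist())

data = pd.DataFrame({'user': [1, 5, 3, 10], 'week': [1, 1, 3, 4],
                     'value1': [5, 4, 3, 2], 'value2': [1, 1, 1, 2]})
data = data.pivot_table(index='user', columns='week', fill_value=0)
data['target'] = [True, True, False, True]

dropped = data.drop('target', axis=1)
used = labels_in_use(dropped.columns)

print('target' in used)
print('value1' in used)
```

Building a set makes repeated membership checks cheap, at the cost of losing which level a label came from.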

      
