Why does a column stay in the DataFrame index even after deleting it?
Consider the following piece of code:
>>> data = pandas.DataFrame({ 'user': [1, 5, 3, 10], 'week': [1, 1, 3, 4], 'value1': [5, 4, 3, 2], 'value2': [1, 1, 1, 2] })
>>> data = data.pivot_table(index='user', columns='week', fill_value=0)
>>> data['target'] = [True, True, False, True]
>>> data
     value1       value2       target
week      1  3  4      1  3  4
user
1         5  0  0      1  0  0   True
3         0  3  0      0  1  0   True
5         4  0  0      1  0  0  False
10        0  0  2      0  0  2   True
Now if I call this:
>>> 'target' in data.columns
True
It returns True, as expected. However, why does this also return True?
>>> 'target' in data.drop('target', axis=1).columns
True
How can I drop a column from the table so that it is no longer in the index and the above statement returns False?
As of pandas 0.19.2, a MultiIndex keeps every label it has used in its internal levels. Dropping a column does not remove its label from those levels, so a membership test still finds it. There is a long GitHub issue discussing this behavior.
So you have to work around the problem, and that means making some assumptions. If you are sure that the labels you are checking live at a specific index level (level 0 in your example), one way to do it is:
'target' in data.drop('target', axis=1).columns.get_level_values(0)
Out[145]: False
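To see why the level-based check works, it can help to compare the labels still in use with the levels the MultiIndex stores. The following is a minimal sketch that rebuilds the frame from the question. It assumes a pandas version that provides `MultiIndex.remove_unused_levels()` (0.20 or later, so newer than the 0.19.2 discussed here); the variable names `dropped` and `used` are only illustrative.

```python
import pandas as pd

# Rebuild the frame from the question.
data = pd.DataFrame({
    'user': [1, 5, 3, 10],
    'week': [1, 1, 3, 4],
    'value1': [5, 4, 3, 2],
    'value2': [1, 1, 1, 2],
})
data = data.pivot_table(index='user', columns='week', fill_value=0)
data['target'] = [True, True, False, True]

dropped = data.drop('target', axis=1)

# Labels actually used by the remaining columns at level 0.
used = dropped.columns.get_level_values(0)
print('target' in used)

# Assumption: pandas >= 0.20. Prune labels that no column uses any more,
# so the stored levels match the labels in use.
dropped.columns = dropped.columns.remove_unused_levels()
print('target' in dropped.columns.levels[0])
print('target' in dropped.columns)
```

After pruning, all three checks should print False, because the stored levels no longer mention the dropped label.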
If the label could be at any level, you can use get_values() and search the entire flattened list:
import itertools as it
list(it.chain.from_iterable(data.drop('target', axis=1).columns.get_values()))
Out[150]: ['value1', 1, 'value1', 3, 'value1', 4, 'value2', 1, 'value2', 3, 'value2', 4]
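The same any-level search can be wrapped in a small helper that returns a boolean directly. This is only a sketch: `label_in_any_level` is a hypothetical name, not a pandas function, and it relies on `get_level_values()` instead of `get_values()`, since the latter was deprecated in later pandas releases.

```python
import pandas as pd

def label_in_any_level(columns, label):
    """Return True if `label` is used by at least one column, at any level.

    Hypothetical helper, not part of pandas.
    """
    if isinstance(columns, pd.MultiIndex):
        # get_level_values() only reports labels that are still in use,
        # unlike the stored levels, which may keep dropped labels.
        return any(
            label in columns.get_level_values(i)
            for i in range(columns.nlevels)
        )
    return label in columns

# Rebuild the frame from the question.
data = pd.DataFrame({
    'user': [1, 5, 3, 10],
    'week': [1, 1, 3, 4],
    'value1': [5, 4, 3, 2],
    'value2': [1, 1, 1, 2],
})
data = data.pivot_table(index='user', columns='week', fill_value=0)
data['target'] = [True, True, False, True]

remaining = data.drop('target', axis=1).columns
print(label_in_any_level(remaining, 'target'))  # False
print(label_in_any_level(remaining, 'value1'))  # True
print(label_in_any_level(remaining, 3))         # True (week 3 at level 1)
```

Checking each level separately avoids building the flattened list, and it also works on a plain (single-level) column index.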