KeyError: False in pandas dataframe
import pandas as pd
businesses = pd.read_json(businesses_filepath, lines=True, encoding='utf_8')
restaurantes = businesses['Restaurants' in businesses['categories']]
I would like to drop the rows that do not have 'Restaurants' in the categories column (each cell in that column holds a list). However, I get the error "KeyError: False", and I would like to understand why it happens and how to fix it.
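The expression 'Restaurants' in businesses['categories'] evaluates to the single boolean value False, because the in operator on a pandas Series tests the index labels, not the values. That False is then passed to the bracket indexing operator of the businesses DataFrame, which has no column named False and therefore raises a KeyError.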
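To make the failure concrete, here is a minimal sketch that reproduces it. The two-row DataFrame is a made-up stand-in for the real file (an assumption, not the asker's data); only the categories column matters.

```python
import pandas as pd

# Hypothetical stand-in for the real data loaded with read_json.
businesses = pd.DataFrame({
    "name": ["A", "B"],
    "categories": [["Restaurants", "Pizza"], ["Shopping"]],
})

# `in` on a Series tests the index labels (here 0 and 1), not the cell values.
label_check = "Restaurants" in businesses["categories"]
index_check = 0 in businesses["categories"]

# Indexing the DataFrame with that single bool looks up a column named False.
try:
    businesses[label_check]
    error_name = None
except KeyError as exc:
    error_name = type(exc).__name__
```

Here label_check is a single bool rather than one value per row, which is why it cannot be used as a row filter.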
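What you want is called boolean (logical) indexing: build a Series of True/False values, one per row, and index the DataFrame with it. For a column whose cells are plain strings, it works like this.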
businesses[businesses['categories'] == 'Restaurants']
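Since the question says the categories cells hold lists, an equality comparison against a string would not match any row. A minimal sketch of the same boolean-indexing idea for list cells might look like the following; the sample data and the helper name has_restaurants are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample: list-valued cells, plus one missing value.
businesses = pd.DataFrame({
    "name": ["A", "B", "C"],
    "categories": [["Restaurants", "Pizza"], ["Shopping"], None],
})

def has_restaurants(cats):
    # Missing values (None/NaN) are not lists, so treat them as "no match".
    return isinstance(cats, list) and "Restaurants" in cats

# One True/False per row, then use it as the row filter.
mask = businesses["categories"].apply(has_restaurants)
restaurantes = businesses[mask]
```

The isinstance check is a design choice to keep rows with missing categories from raising a TypeError; drop it if the column is known to contain only lists.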
If you find that your data contains spelling variations or alternative terms for a restaurant, the following may be helpful. Essentially, you put your restaurant terms in restaurant_lst. The lambda function returns True if any element of restaurant_lst is found in a given row of the businesses categories Series. The .loc indexer then filters out the rows for which the lambda returns False.
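restaurant_lst = ['Restaurant','restaurantes','diner','bistro']
# Apply the test to the 'categories' column only; DataFrame.apply would pass whole columns instead.
# For list cells `in` is exact membership; for string cells it is a substring test.
restaurant = businesses.loc[businesses['categories'].apply(lambda x: any(restaurant_str in x for restaurant_str in restaurant_lst))]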
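If the goal is to catch variations such as 'Restaurants' or 'French Bistro' inside list cells, exact list membership is too strict. One possible sketch, assuming case-insensitive substring matching is acceptable, is shown below; the sample data, the lowercase term list, and the helper name matches_any_term are all hypothetical.

```python
import pandas as pd

# Hypothetical sample with list-valued cells and one missing value.
businesses = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "categories": [["Restaurants", "Pizza"], ["Shopping"], ["French Bistro"], None],
})

# Lowercase terms, matched as substrings of each lowercased category.
restaurant_lst = ["restaurant", "diner", "bistro"]

def matches_any_term(cats):
    # Skip missing values instead of raising a TypeError.
    if not isinstance(cats, list):
        return False
    return any(term in cat.lower() for cat in cats for term in restaurant_lst)

restaurant = businesses.loc[businesses["categories"].apply(matches_any_term)]
```

Substring matching can over-match (a term may appear inside an unrelated category name), so the term list is worth checking against the actual category values.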