KeyError: False in pandas dataframe

import pandas as pd

businesses = pd.read_json(businesses_filepath, lines=True, encoding='utf_8')
restaurantes = businesses['Restaurants' in businesses['categories']]

      

I would like to drop the rows that do not have 'Restaurants' in the categories column (that column contains lists). However, I get the error "KeyError: False", and I would like to understand why it happens and how to solve it.



3 answers


The expression 'Restaurants' in businesses['categories'] returns the boolean value False, because the in operator applied to a Series tests the index labels, not the values. That False is then passed to the bracket indexing operator of the businesses DataFrame, which has no column named False and thus raises a KeyError.
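The behaviour can be reproduced on a small frame. The sketch below uses made-up data (the two-row businesses frame is an assumption, not the asker's file) to show that in on a Series looks at the index labels, and that indexing with the resulting False raises the error from the question.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data: 'categories' holds lists.
businesses = pd.DataFrame({
    "name": ["A", "B"],
    "categories": [["Restaurants", "Pizza"], ["Shopping"]],
})

# `in` on a Series checks the index labels (0 and 1 here), not the values.
found = "Restaurants" in businesses["categories"]
label_found = 0 in businesses["categories"]

# Indexing the DataFrame with a plain bool looks up a column with that label.
try:
    businesses[found]
    error_message = None
except KeyError as exc:
    error_message = repr(exc)
```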

What you want to do is what's called logical indexing, which works like this.



businesses[businesses['categories'] == 'Restaurants']
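Since the question says the column holds lists, an equality test against the string 'Restaurants' would likely match no rows. A minimal sketch for that case, assuming each cell is either a list of strings or a missing value (the sample frame is invented for illustration), builds the boolean mask with apply instead:

```python
import pandas as pd

# Hypothetical sample data: lists of category names, plus one missing value.
businesses = pd.DataFrame({
    "name": ["A", "B", "C"],
    "categories": [["Restaurants", "Pizza"], ["Shopping"], None],
})

# True where the cell is a list containing 'Restaurants'; missing values give False.
mask = businesses["categories"].apply(
    lambda cats: isinstance(cats, list) and "Restaurants" in cats
)

# Boolean (logical) indexing keeps only the rows where the mask is True.
restaurants = businesses[mask]
```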

      



I think what you meant is:

businesses = businesses.loc[businesses['categories'] == 'Restaurants']

which will contain only the rows whose category is Restaurants.



If you find that your data contains spelling variations or alternative terms for a restaurant, the following may be helpful. Essentially, you list your restaurant terms in restaurant_lst. The lambda function returns True if any of the elements of restaurant_lst is found in a given row of the categories Series. The .loc indexer then filters out the rows for which the lambda function returns False.

restaurant_lst = ['Restaurant','restaurantes','diner','bistro']
# Apply the check to the categories column (a Series), so the lambda sees one row's value at a time
restaurant = businesses.loc[businesses['categories'].apply(lambda x: any(restaurant_str in x for restaurant_str in restaurant_lst))]
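Note that in does an exact match on a list but a substring match on a string, so the result depends on how the cells are stored. A hedged variant that behaves the same for both, assuming cells may be lists, comma-separated strings, or missing (the sample data and the helper name mentions_restaurant are made up for illustration), could look like this:

```python
import pandas as pd

# Hypothetical sample data mixing lists, a plain string, and a missing value.
businesses = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "categories": [["Restaurants", "Pizza"], "Bistro, Wine Bars", ["Shopping"], None],
})

# Lowercase terms, since the comparison below is case-insensitive.
restaurant_lst = ["restaurant", "diner", "bistro"]

def mentions_restaurant(cats):
    """Return True if any term occurs, case-insensitively, in the cell."""
    if isinstance(cats, list):
        text = " ".join(cats)
    elif isinstance(cats, str):
        text = cats
    else:
        return False  # missing value
    text = text.lower()
    return any(term in text for term in restaurant_lst)

restaurant = businesses.loc[businesses["categories"].apply(mentions_restaurant)]
```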

      
