"Where is the clause" in a list in pandas Dataframe

I have a pandas DataFrame named df:

     email        | list
___________________________
email1@email.com  | [0,1]
email1@email.com  | [2,1]
email1@email.com  | [0,3]
email1@email.com  | [0,0]
email1@email.com  | [0,1]

      

I want to get the whole row from df where the list column equals [0,0].

I am doing:

df2 = df[df['list'] == [0,0]]

      

But I am getting the following error:

ValueError: Arrays were different lengths: 5 vs 2

      



2 answers


The reason it doesn't work:

df2 = df[df['list'] == [0, 0]]

      

is that df['list'] is a Series of 5 elements, while [0, 0] is a list of two elements. pandas tries to compare the two element by element, so the lengths do not match and the error is raised while evaluating your mask

df['list'] == [0, 0]
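
To see the length mismatch concretely, here is a minimal sketch that rebuilds the dataframe from the question and runs the failing comparison. The sample data is copied from the table in the question. The exact error text depends on the pandas version, so the sketch only checks that a ValueError is raised.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "email": ["email1@email.com"] * 5,
    "list": [[0, 1], [2, 1], [0, 3], [0, 0], [0, 1]],
})

# pandas treats the bare list [0, 0] as a 2-element array and tries to
# line it up with the 5-element column, which fails.
try:
    mask = df["list"] == [0, 0]
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```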

      

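Updated solution

I believe a quick way to solve this is to create a Series of [0, 0] elements with the same length as your dataframe and compare that Series with your column. Both Series must have the same index, so this works as written when df uses the default integer index.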

df['list'] == pd.Series([[0, 0]] * len(df))

0    False
1    False
2    False
3    True
4    False

      

This creates a mask by comparing each item in the column with [0, 0], instead of comparing the whole column df['list'] with [0, 0].

Using this mask you can create a new dataframe:

mask = df['list'] == pd.Series([[0, 0]] * len(df))
df2 = df[mask]
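
If df does not use the default integer index, an alternative that does not depend on index alignment is to test each cell on its own. The following is a minimal sketch, not part of the original answer. It assumes the sample data from the question, and the name target is only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "email": ["email1@email.com"] * 5,
    "list": [[0, 1], [2, 1], [0, 3], [0, 0], [0, 1]],
})

target = [0, 0]  # illustrative name for the list to look for

# Compare every cell with the target list; each comparison is a plain
# Python list equality check, so the result is one boolean per row.
mask = df["list"].apply(lambda value: value == target)
df2 = df[mask]
print(df2)
```

The mask keeps the index of df, so it can be used for filtering regardless of how df is indexed.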

      



The comparison fails because it matches a whole column of lists against a single list. You can instead filter df using iterrows(). iterrows() creates a generator that yields (index, row) tuples whose second entry is the row as a Series, which can be indexed by column name like a dictionary. You can loop through the rows, match against each one, and then create a new dataframe.

import pandas as pd

# Collect the matching rows column by column, then rebuild a dataframe.
df2 = {'email': [], 'list': []}
for _, row in df.iterrows():
    # row is a Series indexed by column name
    if row['list'] == [0, 0]:
        for key in df2:
            df2[key].append(row[key])
df2 = pd.DataFrame.from_dict(df2)

      



Because the result is filled by looping over the dictionary keys, you can reuse this method on any data frame by changing the keys to its column names.
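
The same row-by-row idea can be written without listing the column names at all. This is only a sketch under the assumption that each cell of the column is a plain Python list. The helper name rows_where_list_equals is hypothetical and not part of pandas.

```python
import pandas as pd


def rows_where_list_equals(frame, column, target):
    """Return the rows of frame whose cell in column equals target.

    Hypothetical helper for illustration; not a pandas function.
    """
    # One boolean per row, built by looping over the column's cells.
    mask = [value == target for value in frame[column]]
    # A list of booleans with one entry per row selects the matching rows.
    return frame[mask]


# Sample data copied from the question.
df = pd.DataFrame({
    "email": ["email1@email.com"] * 5,
    "list": [[0, 1], [2, 1], [0, 3], [0, 0], [0, 1]],
})

df2 = rows_where_list_equals(df, "list", [0, 0])
print(df2)
```

All columns of the original data frame are kept, so nothing has to be changed when the frame has different column names.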
