"Where" clause on a list column in a pandas DataFrame
I have a pandas DataFrame named df:
email | list
___________________________
email1@email.com | [0,1]
email1@email.com | [2,1]
email1@email.com | [0,3]
email1@email.com | [0,0]
email1@email.com | [0,1]
I want to get every row of df whose list column equals [0,0].
I am doing:
df2 = df[df['list'] == [0,0]]
But I am getting the following error:
ValueError: Arrays were different lengths: 5 vs 2
The reason it doesn't work:
df2 = df[df['list'] == [0, 0]]
is that df['list'] is a column of five elements while [0, 0] is a list of two elements. pandas treats [0, 0] as an array to compare element by element, so the error is raised while evaluating your mask
df['list'] == [0, 0]
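To see the length mismatch concretely, here is a minimal sketch that rebuilds the frame from the question and catches the error. The names column_length, target_length, and error_message are illustrative only, and the exact error text differs between pandas versions.

```python
import pandas as pd

# Rebuild the DataFrame from the question.
df = pd.DataFrame({
    'email': ['email1@email.com'] * 5,
    'list': [[0, 1], [2, 1], [0, 3], [0, 0], [0, 1]],
})

column_length = len(df['list'])   # number of rows in the column
target_length = len([0, 0])       # number of items in the list

# pandas treats [0, 0] as an array-like to compare element by element,
# so both sides must have the same length; here they do not.
try:
    df['list'] == [0, 0]
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(column_length, target_length, error_message)
```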
Updated with the correct solution
I believe the fastest way to solve this is to create a Series of [0, 0] elements with the same length as your DataFrame and compare that Series with your column:
df['list'] == pd.Series([[0, 0]] * len(df))
0 False
1 False
2 False
3 True
4 False
This creates a mask by comparing each item in the column with [0, 0], instead of comparing the whole column df['list'] with [0, 0].
Using this mask you can create a new dataframe
mask = df['list'] == pd.Series([[0, 0]] * len(df))
df2 = df[mask]
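One caveat: comparing two Series requires both to carry identical index labels, so the approach above assumes df has the default 0..n-1 index. As a hedged alternative (not part of the original answer), the sketch below builds the mask with Series.apply, which compares each cell on its own and keeps the index of df. The non-default index is chosen here only to illustrate that point.

```python
import pandas as pd

# Same data as the question, but with a deliberately non-default index.
df = pd.DataFrame({
    'email': ['email1@email.com'] * 5,
    'list': [[0, 1], [2, 1], [0, 3], [0, 0], [0, 1]],
}, index=[10, 11, 12, 13, 14])

# Compare each cell with [0, 0] individually; the mask inherits df's index.
mask = df['list'].apply(lambda cell: cell == [0, 0])
df2 = df[mask]
print(df2)
```

This is a plain Python loop under the hood, so it may be slower on large frames.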
You are comparing a column of lists to a single list. You should instead filter df using iterrows(). iterrows() creates a generator that yields (index, row) tuples whose second entry is the row as a Series, which you can index by column name like a dictionary. You can loop through the rows, match against each one, and then create a new DataFrame:
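import pandas

# Collect the matching rows column by column.
df2 = {'email': [], 'list': []}
for row in df.iterrows():
    # iterrows() yields (index, row) tuples; the row is the second entry.
    row_dictionary = row[1]
    if row_dictionary['list'] == [0, 0]:
        for key in df2.keys():
            df2[key].append(row_dictionary[key])
df2 = pandas.DataFrame.from_dict(df2)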
Because the inner loop fills the result by dictionary key, you can use this method on any DataFrame once the keys of df2 match its column names.
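As a rough sketch of that generalization, the keys can be taken from df.columns instead of being written out by hand. The helper name filter_rows and its parameters are hypothetical, introduced only for illustration.

```python
import pandas as pd

def filter_rows(df, column, target):
    """Return the rows of df whose `column` cell equals `target` (hypothetical helper)."""
    # One empty list per column, so no column name is hard-coded.
    collected = {name: [] for name in df.columns}
    for _, row in df.iterrows():
        if row[column] == target:
            for name in collected:
                collected[name].append(row[name])
    return pd.DataFrame.from_dict(collected)

df = pd.DataFrame({
    'email': ['email1@email.com'] * 5,
    'list': [[0, 1], [2, 1], [0, 3], [0, 0], [0, 1]],
})
df2 = filter_rows(df, 'list', [0, 0])
print(df2)
```

Note that the result gets a fresh 0..n-1 index, since the original index labels are not collected.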