How to create a dictionary of key: column_name and value: unique values in a column in Python from a DataFrame
I am trying to create a dictionary of key: value pairs where key is the name of a dataframe column and value will be a list containing all the unique values in that column. Ultimately, I want to be able to filter out the key: value pairs from the dict based on conditions. This is what I have been able to do so far:
for col in col_list[1:]:
    _list = []
    _list.append(footwear_data[col].unique())
    list_name = ''.join([str(col),'_list'])
product_list = ['shoe','footwear']
color_list = []
size_list = []
Here product, color, size are all column names, and dict keys should be named accordingly as color_list, etc. Ultimately I will need to access every key: value_list in the dictionary. Expected Result:
KEY VALUE
color_list : ["red","blue","black"]
size_list: ["9","XL","32","10 inches"]
Can someone please help me on this? A snapshot of the data is attached.
With the following DataFrame:
import pandas as pd
df = pd.DataFrame([["Women", "Slip on", 7, "Black", "Clarks"], ["Women", "Slip on", 8, "Brown", "Clarcks"], ["Women", "Slip on", 7, "Blue", "Clarks"]], columns= ["Category", "Sub Category", "Size", "Color", "Brand"])
print(df)
Output:
  Category Sub Category  Size  Color    Brand
0    Women      Slip on     7  Black   Clarks
1    Women      Slip on     8  Brown  Clarcks
2    Women      Slip on     7   Blue   Clarks
You can build the new dict by mapping each key to the values of a DataFrame column, as in this example:
new_dict = {"color_list": list(df["Color"]), "size_list": list(df["Size"])}
# OR:
#new_dict = {"color_list": [k for k in df["Color"]], "size_list": [k for k in df["Size"]]}
print(new_dict)
Output:
{'color_list': ['Black', 'Brown', 'Blue'], 'size_list': [7, 8, 7]}
To keep only the unique values, you can use set, as in this example:
new_dict = {"color_list": list(set(df["Color"])), "size_list": list(set(df["Size"]))}
print(new_dict)
Output:
{'color_list': ['Brown', 'Blue', 'Black'], 'size_list': [8, 7]}
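One caveat with set: it does not keep the order in which the values appear in the column, so the printed order may differ from the one shown above. If the order of first appearance matters, a sketch along these lines may help. It assumes the same DataFrame as above and uses dict.fromkeys, whose keys are unique and keep insertion order, in place of set.

```python
import pandas as pd

df = pd.DataFrame(
    [["Women", "Slip on", 7, "Black", "Clarks"],
     ["Women", "Slip on", 8, "Brown", "Clarcks"],
     ["Women", "Slip on", 7, "Blue", "Clarks"]],
    columns=["Category", "Sub Category", "Size", "Color", "Brand"],
)

# dict.fromkeys drops duplicates while keeping the order of first appearance.
new_dict = {
    "color_list": list(dict.fromkeys(df["Color"])),
    "size_list": list(dict.fromkeys(df["Size"])),
}
print(new_dict)
```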
Or, as @Ami Tavory says in his answer, to get the unique values of every column in your DataFrame, you can simply do this:
new_dict = {k:list(df[k].unique()) for k in df.columns}
print(new_dict)
Output:
{'Brand': ['Clarks', 'Clarcks'],
'Category': ['Women'],
'Color': ['Black', 'Brown', 'Blue'],
'Size': [7, 8],
'Sub Category': ['Slip on']}
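The question asks for keys such as color_list rather than the bare column names. A minimal sketch of that naming, assuming the same DataFrame as above: the lower-casing and the replacement of spaces with underscores are my own assumptions about the desired key format, and drop_duplicates().tolist() is used here so that the values come back as plain Python objects.

```python
import pandas as pd

df = pd.DataFrame(
    [["Women", "Slip on", 7, "Black", "Clarks"],
     ["Women", "Slip on", 8, "Brown", "Clarcks"],
     ["Women", "Slip on", 7, "Blue", "Clarks"]],
    columns=["Category", "Sub Category", "Size", "Color", "Brand"],
)

def key_for(col):
    # Hypothetical naming rule: "Sub Category" -> "sub_category_list"
    return col.lower().replace(" ", "_") + "_list"

# drop_duplicates() keeps the first occurrence of each value, in order.
new_dict = {key_for(col): df[col].drop_duplicates().tolist() for col in df.columns}
print(new_dict["color_list"])
```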
I am trying to create a dictionary of key: value pairs where key is the name of a column of a dataframe and value will be a list containing all the unique values in that column.
You can use a simple dict comprehension for this.
Let's say you start with
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 1], 'b': [1, 4, 5]})
Then the following comprehension solves this:
>>> {c: list(df[c].unique()) for c in df.columns}
{'a': [1, 2], 'b': [1, 4, 5]}
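The question also mentions filtering the key: value pairs by a condition afterwards. One way this might look, starting from the dict produced above (written out as a literal so the snippet runs on its own); both conditions are made-up examples:

```python
# Result of the comprehension above, written out as a literal.
unique_values = {'a': [1, 2], 'b': [1, 4, 5]}

# Hypothetical condition 1: keep columns with more than two distinct values.
many_values = {key: vals for key, vals in unique_values.items() if len(vals) > 2}
print(many_values)  # {'b': [1, 4, 5]}

# Hypothetical condition 2: keep columns whose unique values include 2.
contains_two = {key: vals for key, vals in unique_values.items() if 2 in vals}
print(contains_two)  # {'a': [1, 2]}
```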
If I understand your question correctly, you may need a set instead of a list. In this piece of code you can probably add set to get the unique values of a given list.
for col in col_list[1:]:
    # set() keeps a single copy of each value in the column
    _list = set(footwear_data[col])
    list_name = ''.join([str(col),'_list'])
Usage example
>>> a_list = [7, 8, 7, 9, 10, 9]
>>> set(a_list)
{8, 9, 10, 7}
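Since a set has no defined order, the result is often turned back into a list. A small sketch, assuming the values can be compared with each other (all numbers, or all strings):

```python
a_list = [7, 8, 7, 9, 10, 9]

# sorted() accepts any iterable and always returns a list.
unique_sorted = sorted(set(a_list))
print(unique_sorted)  # [7, 8, 9, 10]
```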
This is how I did it, let me know if it helps
import pandas as pd

df = pd.read_csv("/path/to/csv/file")
colList = list(df)  # column names
dic = {}
for x in colList:
    _list = []
    # note: append() wraps the unique values in an outer list
    _list.append(list(set(list(df[x]))))
    list_name = ''.join([str(x), '_list'])
    dic[list_name] = _list
print(dic)
Output:
{'Color_list': [['Blue', 'Orange', 'Black', 'Red']], 'Size_list': [['9', '8', '10 inches', 'XL', '7']], 'Brand_list': [['Clarks']], 'Sub_list': [['SO', 'FOR']], 'Category_list': [['M', 'W']]}
My CSV file:
Category,Sub,Size,Color,Brand
W,SO,7,Blue,Clarks
W,SO,7,Blue,Clarks
W,SO,7,Black,Clarks
W,SO,8,Orange,Clarks
W,FOR,8,Red,Clarks
M,FOR,9,Black,Clarks
M,FOR,10 inches,Blue,Clarks
M,FOR,XL,Blue,Clarks
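If flat lists are preferred over the nested ones shown in the output above, the loop can assign the unique values directly instead of appending them to an empty list. This is a sketch rather than the original answer's code: it reads the same CSV rows from an in-memory string (so no file path is needed), passes dtype=str as an assumption so that every size stays a string, and uses drop_duplicates() to keep the order of first appearance.

```python
import io

import pandas as pd

csv_text = """Category,Sub,Size,Color,Brand
W,SO,7,Blue,Clarks
W,SO,7,Blue,Clarks
W,SO,7,Black,Clarks
W,SO,8,Orange,Clarks
W,FOR,8,Red,Clarks
M,FOR,9,Black,Clarks
M,FOR,10 inches,Blue,Clarks
M,FOR,XL,Blue,Clarks
"""

# dtype=str keeps values such as "7" and "XL" as strings.
df = pd.read_csv(io.StringIO(csv_text), dtype=str)

dic = {}
for col in df.columns:
    # One flat list of unique values per column, in order of first appearance.
    dic[f"{col}_list"] = df[col].drop_duplicates().tolist()

print(dic)
```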