Pandas DataFrame merge, case-insensitive
I am struggling to find the simplest way to do a case-insensitive merge in pandas. Is there a way to do this directly in the merge? Should I use (?i) or a regex with ignorecase? In my code snippet below, I am joining on country names, where a value might be "United States" in one file and "UNITED STATES" in another, and I just want to take case out of the equation. Thanks!
import pandas as pd
import csv
import sys
env_path = sys.argv[1]
map_path = sys.argv[2]
df_address = pd.read_csv(env_path + "\\address.csv")
df_CountryMapping = pd.read_csv(map_path + "\\CountryMapping.csv")
df_merged = df_address.merge(df_CountryMapping, left_on="Country", right_on="NAME", how="left")
....
+5
3 answers
Lowercase the values in the two columns used for the join, and then merge on the lowercase columns:
df_address['country_lower'] = df_address['Country'].str.lower()
df_CountryMapping['name_lower'] = df_CountryMapping['NAME'].str.lower()
df_merged = df_address.merge(df_CountryMapping, left_on="country_lower", right_on="name_lower", how="left")
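If the helper columns should not end up in the result, one possible variant is to build them on temporary copies and drop them after the merge. This is only a sketch: the sample rows and the "Code" column are made up for illustration, since the real contents of the CSV files are not shown in the question.

```python
import pandas as pd

# Illustrative data only; "Code" is a hypothetical stand-in for whatever
# CountryMapping.csv actually contains.
df_address = pd.DataFrame({"Country": ["United States", "france", "Narnia"]})
df_CountryMapping = pd.DataFrame(
    {"NAME": ["UNITED STATES", "France"], "Code": ["US", "FR"]}
)

# assign() returns new frames, so the original frames keep their columns.
left = df_address.assign(country_lower=df_address["Country"].str.lower())
right = df_CountryMapping.assign(name_lower=df_CountryMapping["NAME"].str.lower())

# Merge on the lowercase helper keys, then drop them from the result.
df_merged = left.merge(
    right, left_on="country_lower", right_on="name_lower", how="left"
).drop(columns=["country_lower", "name_lower"])

print(df_merged)
```

The original spelling in "Country" and "NAME" is preserved, and rows with no match (here "Narnia") keep missing values in the mapping columns because of how="left".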
+7
I suggest lowercasing the column names after reading the files:
df_address.columns = [c.lower() for c in df_address.columns]
df_CountryMapping.columns = [c.lower() for c in df_CountryMapping.columns]
Then lowercase the values in the join columns:
df_address['country'] = df_address['country'].str.lower()
df_CountryMapping['name'] = df_CountryMapping['name'].str.lower()
And only then merge
df_merged = df_address.merge(df_CountryMapping, left_on="country", right_on="name", how="left")
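For names that contain non-ASCII letters, str.lower() may not be enough. A possible variant, assuming the data can contain such names, is to normalize with str.casefold() instead. The sample rows and the "code" column below are invented for illustration.

```python
import pandas as pd

# Illustrative data; lowercase column names mirror the steps above.
# The "code" column is hypothetical.
df_address = pd.DataFrame({"country": ["Weißrussland", "United States"]})
df_CountryMapping = pd.DataFrame(
    {"name": ["WEISSRUSSLAND", "UNITED STATES"], "code": ["BY", "US"]}
)

# str.lower() leaves "ß" unchanged, so the first pair still differs.
lower_match = df_address["country"].str.lower().isin(
    df_CountryMapping["name"].str.lower()
)

# str.casefold() maps "ß" to "ss", so both spellings normalize identically.
df_address["country"] = df_address["country"].str.casefold()
df_CountryMapping["name"] = df_CountryMapping["name"].str.casefold()

df_merged = df_address.merge(
    df_CountryMapping, left_on="country", right_on="name", how="left"
)
print(df_merged)
```

As in the steps above, this overwrites the original spelling in both frames, so keep a copy of the columns if the original casing is needed later.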
+1
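One step would be to convert the column names of both data frames to lowercase. Note that this normalizes only the column labels, not the values being joined, so the values still need to be lowercased as in the other answers. So something like this: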
df_address = pd.read_csv(env_path + "\\address.csv")
df_CountryMapping = pd.read_csv(map_path + "\\CountryMapping.csv")
df_address.rename(columns=lambda x: x.lower(), inplace=True)
df_CountryMapping.rename(columns=lambda x: x.lower(), inplace=True)
df_merged = df_address.merge(df_CountryMapping, left_on="country", right_on="name", how="left")
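To check whether renaming the columns is enough on its own, the merge can be run with indicator=True, which adds a _merge column showing where each row was found. The sketch below uses made-up rows and a hypothetical "Code" column, and compares the result before and after the values are lowercased.

```python
import pandas as pd

# Illustrative data only; "Code" is a hypothetical mapping column.
df_address = pd.DataFrame({"Country": ["United States"]})
df_CountryMapping = pd.DataFrame({"NAME": ["UNITED STATES"], "Code": ["US"]})

# Lowercase the column labels only (same effect as the rename calls above).
df_address = df_address.rename(columns=str.lower)
df_CountryMapping = df_CountryMapping.rename(columns=str.lower)

# The values still differ in case, so the row finds no partner.
only_names = df_address.merge(
    df_CountryMapping, left_on="country", right_on="name",
    how="left", indicator=True,
)
print(only_names["_merge"].tolist())

# Lowercase the values as well, then merge again.
df_address["country"] = df_address["country"].str.lower()
df_CountryMapping["name"] = df_CountryMapping["name"].str.lower()
with_values = df_address.merge(
    df_CountryMapping, left_on="country", right_on="name",
    how="left", indicator=True,
)
print(with_values["_merge"].tolist())
```

The first merge reports the row as present only on the left side; once the values are lowercased too, the same row is found in both frames.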
+1