Pandas DataFrame merge, case insensitive

I am struggling to find the simplest way to do a case-insensitive merge in pandas. Is there a way to do this directly in the merge? Should I use (?i) or a regex with IGNORECASE? In my code snippet below, I am joining on country names, where one file might have "United States" and the other "UNITED STATES", and I just want to take that difference out of the equation. Thanks!

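import os
import sys

import pandas as pd

# Directories holding the two CSV files, passed on the command line.
env_path = sys.argv[1]
map_path = sys.argv[2]

# os.path.join avoids hand-written backslashes such as "\C", which is an invalid escape sequence.
df_address = pd.read_csv(os.path.join(env_path, "address.csv"))
df_CountryMapping = pd.read_csv(os.path.join(map_path, "CountryMapping.csv"))

df_merged = df_address.merge(df_CountryMapping, left_on="Country", right_on="NAME", how="left")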

....

      



3 answers


Lowercase the values in the two columns used for the join, then merge on the lowercased columns:



df_address['country_lower'] = df_address['Country'].str.lower()
df_CountryMapping['name_lower'] = df_CountryMapping['NAME'].str.lower()
df_merged = df_address.merge(df_CountryMapping, left_on="country_lower", right_on="name_lower", how="left")
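To see the effect end to end, here is a minimal sketch on toy data. The rows and the CODE column are made up for illustration and are not from the question's files. As far as I know, merge itself has no case-insensitive option, so the helper keys do the work and are dropped afterwards.

```python
import pandas as pd

# Toy stand-ins for address.csv and CountryMapping.csv (made-up rows, hypothetical CODE column).
df_address = pd.DataFrame({"Country": ["United States", "canada", "France"]})
df_CountryMapping = pd.DataFrame(
    {"NAME": ["UNITED STATES", "Canada"], "CODE": ["US", "CA"]}
)

# Build lowercase helper keys; the original columns keep their casing.
df_address["country_lower"] = df_address["Country"].str.lower()
df_CountryMapping["name_lower"] = df_CountryMapping["NAME"].str.lower()

df_merged = df_address.merge(
    df_CountryMapping, left_on="country_lower", right_on="name_lower", how="left"
)

# Drop the helper keys once the join is done.
df_merged = df_merged.drop(columns=["country_lower", "name_lower"])
print(df_merged)
```

"United States" and "canada" pick up their mapping rows despite the different casing, while "France" has no mapping row and gets NaN in the NAME and CODE columns because of how="left".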

      



I suggest lowercasing the column names after reading the files:

df_address.columns = [c.lower() for c in df_address.columns]
df_CountryMapping.columns = [c.lower() for c in df_CountryMapping.columns]

      

Then lowercase the values in the key columns:

df_address['country'] = df_address['country'].str.lower()
df_CountryMapping['name'] = df_CountryMapping['name'].str.lower()

      

And only then merge

df_merged = df_address.merge(df_CountryMapping, left_on="country", right_on="name", how="left")
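One side effect of overwriting the key columns is that the original casing is lost in the merged output. If that matters, the same idea can be wrapped in a small helper that builds a temporary key instead. This is only a sketch: merge_case_insensitive and the "_merge_key" column name are hypothetical names, and the sample rows are made up.

```python
import pandas as pd


def merge_case_insensitive(left, right, left_on, right_on, how="left"):
    """Hypothetical helper: merge two frames on string columns, ignoring case."""
    key = "_merge_key"  # temporary column name, assumed not to clash with real columns
    # assign() returns new frames, so the callers' DataFrames are not modified.
    left_keyed = left.assign(**{key: left[left_on].str.casefold()})
    right_keyed = right.assign(**{key: right[right_on].str.casefold()})
    return left_keyed.merge(right_keyed, on=key, how=how).drop(columns=key)


# Made-up sample data for illustration.
addresses = pd.DataFrame({"Country": ["United States", "FRANCE"]})
mapping = pd.DataFrame({"NAME": ["UNITED STATES", "France"], "CODE": ["US", "FR"]})

merged = merge_case_insensitive(addresses, mapping, "Country", "NAME")
print(merged)
```

str.casefold() is used here instead of str.lower() because it is the more aggressive normalization for caseless comparison; for plain ASCII country names the two behave the same.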

      



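One solution is to convert the column names of both data frames to lowercase. Renaming the columns only normalizes the headers, so the key values are lowercased as well before merging. Something like this:

df_address = pd.read_csv(env_path + "\\address.csv")
df_CountryMapping = pd.read_csv(map_path + "\\CountryMapping.csv")

# Lowercase the column names in place.
df_address.rename(columns=lambda x: x.lower(), inplace=True)
df_CountryMapping.rename(columns=lambda x: x.lower(), inplace=True)

# Lowercase the key values too; renaming columns does not change the data.
df_address["country"] = df_address["country"].str.lower()
df_CountryMapping["name"] = df_CountryMapping["name"].str.lower()

df_merged = df_address.merge(df_CountryMapping, left_on="country", right_on="name", how="left")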

      
