Pandas copy values from another dataframe
The pandas DataFrame df1 contains a list of A values:
df1 = pd.DataFrame({'A':['a','a','b']})
A
0 a
1 a
2 b
The DataFrame df2 can be thought of as a mapping from values in A to values in B:
df2 = pd.DataFrame({'A':['a','b'], 'B':[2,3]})
A B
0 a 2
1 b 3
I want to apply this mapping to df1. I have a working version, but I feel there is room for improvement: I find my solution unreadable, and I'm not sure how it would generalize to MultiIndexes.
df2.set_index('A').loc[df1.set_index('A').index].reset_index()
A B
0 a 2
1 a 2
2 b 3
I could also convert df2 to a dictionary and use the replace method, but that doesn't convince me either.
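For reference, the dictionary approach mentioned above might look roughly like this. It is a minimal sketch using the df1 and df2 from the question; the names mapping and replaced are only illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['a', 'a', 'b']})
df2 = pd.DataFrame({'A': ['a', 'b'], 'B': [2, 3]})

# Build a plain dict from the two columns of df2: 'a' -> 2, 'b' -> 3
mapping = dict(zip(df2['A'], df2['B']))

# replace() swaps every value that is a key of the dict;
# values that are not keys are left unchanged
replaced = df1['A'].replace(mapping)
```

Note that replace leaves unmatched values as they are, which may or may not be what you want for a lookup.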
There is a function for this, map, which takes a dict or a Series; in the latter case it uses the index to do the lookup:
In [94]:
df1['A'].map(df2.set_index('A')['B'])
Out[94]:
0 2
1 2
2 3
Name: A, dtype: int64
In [93]:
%timeit df1['A'].map(df2.set_index('A')['B'])
%timeit df1.merge(df2, on='A')
1000 loops, best of 3: 718 µs per loop
1 loops, best of 3: 1.31 ms per loop
On your test data, map is almost 2x faster. I would expect this to hold for big data as well, since map is Cython-optimized and doesn't need to do as much validation as merge.
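As a self-contained sketch of the map approach, the result can be assigned back to df1 as a new column. The extra value 'c' and the name lookup are assumptions added here to show what happens to a key that has no entry in df2; this also assumes the A values in df2 are unique.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['a', 'a', 'b', 'c']})  # 'c' has no entry in df2
df2 = pd.DataFrame({'A': ['a', 'b'], 'B': [2, 3]})

# Series indexed by A: map() looks each value of df1['A'] up in this index
lookup = df2.set_index('A')['B']

# Values without a match become NaN, so the new column is float
df1['B'] = df1['A'].map(lookup)
```

Unlike replace, map turns unmatched values into NaN rather than leaving them as they were.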
You can use pd.merge(), here called as the DataFrame method:
In [149]: df1.merge(df2, on='A')
Out[149]:
A B
0 a 2
1 a 2
2 b 3
Documentation: pandas.DataFrame.merge()
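Regarding the multi-key case raised in the question, merge accepts a list of columns in on. The sketch below uses made-up frames (left, right, and the extra column C are assumptions, not part of the original data) and how='left' so that rows of the left frame without a match are kept instead of being dropped by the default inner join.

```python
import pandas as pd

# Hypothetical data with a two-column key (A, C)
left = pd.DataFrame({'A': ['a', 'a', 'b', 'c'], 'C': [1, 2, 1, 1]})
right = pd.DataFrame({'A': ['a', 'a', 'b'], 'C': [1, 2, 1], 'B': [10, 20, 30]})

# how='left' keeps every row of `left` in its original order;
# the unmatched key ('c', 1) gets NaN in column B
result = left.merge(right, on=['A', 'C'], how='left')
```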