Pandas: Apply a function to "Column A" while reading "Column B" at the same time
I am using Pandas to drive a Python function. From inputs.csv, I use each row of "Column A" as input to a function. The CSV also has a "Column B" that contains values I want to read into a variable x inside this function. The function should not be applied to "Column B"; it should still be applied to "Column A". Is this possible?
This is the current code that applies the function to "Column A":
import pandas as pd

df = pd.read_csv("inputs.csv", delimiter=",")

def function(a):
    # variables c, d, e are created here
    # I would like to create x from Column B if possible
    return pd.Series([c, d, e])

df[["Column C", "Column D", "Column E"]] = df["Column A"].apply(function)
Post-edit: This question has been identified as a possible duplicate of another question. While the answer may be the same, the question is not. It is probably not obvious to future readers that applying a function to two columns is interchangeable with applying it to one column and "reading" another column at the same time. Therefore, the question should remain open.
Yes. You are currently using Series.apply(); instead, you can use DataFrame.apply() with axis=1 so that each row is passed to the function. Inside the function you can then access the columns as row[<column>].
Example -
In [37]: df
Out[37]:
   X  Y  Count
0  0  1      2
1  0  1      2
2  1  1      2
3  1  0      1
4  1  1      2
5  0  0      1

In [38]: def func1(r):
   ....:     print(r['X'])
   ....:     print(r['Y'])
   ....:     return r
   ....:

In [39]: df.apply(func1, axis=1)
0
1
0
1
1
1
1
0
1
1
0
0
Out[39]:
   X  Y  Count
0  0  1      2
1  0  1      2
2  1  1      2
3  1  0      1
4  1  1      2
5  0  0      1
This is just a very basic example; you can change it to whatever you actually need to do.
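Applied to the column names from the question, the same idea might look like the sketch below. The sample data and the arithmetic used for c, d and e are placeholders (the question does not say how they are computed); only the row["Column A"] / row["Column B"] access pattern is the point.

```python
import pandas as pd

# Hypothetical stand-in for the contents of inputs.csv
df = pd.DataFrame({"Column A": [1, 2, 3], "Column B": [10, 20, 30]})

def function(row):
    a = row["Column A"]  # the value the function is driven by
    x = row["Column B"]  # the extra value read alongside it
    # Placeholder computations: the real c, d, e are not given in the question
    c = a + x
    d = a * x
    e = a - x
    return pd.Series([c, d, e])

# axis=1 passes each row to the function as a Series
df[["Column C", "Column D", "Column E"]] = df.apply(function, axis=1)
print(df)
```

The function still works "from" Column A; Column B is simply available on the same row object.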
Passing axis=1 to the apply method hands each entire row to the function as a single argument, which can be unpacked like a tuple.
However, this is much slower than applying a function to a single column. I would advise against it if performance is an issue.
def scrape(x):
    a, b = x  # x is one row: the values of "Column A" and "Column B"
    # Magically create c, d, e from a
    print(b)
    return pd.Series([c, d, e])

df[["Column C", "Column D", "Column E"]] = df[["Column A", "Column B"]].apply(scrape, axis=1)
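If the row-wise apply turns out to be too slow, one possible alternative (not part of the answer above) is to iterate over the two columns together with zip and build the result in one go. This is only a sketch: the sample data and the body of scrape are placeholders, and whether it is faster for a given workload should be measured rather than assumed.

```python
import pandas as pd

# Hypothetical stand-in for the contents of inputs.csv
df = pd.DataFrame({"Column A": [1, 2, 3], "Column B": [10, 20, 30]})

def scrape(a, b):
    # Placeholder computations: the real c, d, e are not given in the question
    c = a + b
    d = a * b
    e = a - b
    return c, d, e

# Walk both columns in lockstep instead of building a Series for every row
results = [scrape(a, b) for a, b in zip(df["Column A"], df["Column B"])]

# Reuse df.index so the new columns align with the original rows
df[["Column C", "Column D", "Column E"]] = pd.DataFrame(results, index=df.index)
print(df)
```

The function here takes the two values as separate parameters, which keeps it independent of pandas and easy to test on its own.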