Conditional on pandas DataFrames

Let df1, df2 and df3 be pandas.DataFrame objects with the same structure but different numerical values. I want to execute:

res=if df1>1.0: (df2-df3)/(df1-1) else df3

      

res must have the same structure as df1, df2 and df3.

numpy.where() generates the result as a flat array.

Edit 1:

res must have the same indices as df1, df2 and df3.

For example, I can access df2 as df2["instanceA"]["parameter1"]["paramter2"]. I want to access the newly computed DataFrame / Series res as res["instanceA"]["parameter1"]["paramter2"].



3 answers


numpy.where should actually work fine. The output here is 4x2 (the same shape as df1, df2 and df3).

import numpy as np
import pandas as pd

df1 = pd.DataFrame( np.random.randn(4,2), columns=list('xy') )
df2 = pd.DataFrame( np.random.randn(4,2), columns=list('xy') )
df3 = pd.DataFrame( np.random.randn(4,2), columns=list('xy') )

# start from a copy of df3 so that res carries the same index and columns
res = df3.copy()
res[:] = np.where( df1 > 1, (df2-df3)/(df1-1), df3 )

          x         y
0 -0.671787 -0.445276
1 -0.609351 -0.881987
2  0.324390  1.222632
3 -0.138606  0.955993

      

Note that this should work with both series and dataframes. Assigning through [:] writes the values into the existing object, so its index and columns are preserved. Without it, res would be rebound to the plain array that numpy.where returns, not a series or dataframe.
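To see the difference concretely, here is a small sketch with hand-picked values. The row labels a, b, c and all the numbers are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

# Illustrative data only: labels and values are assumptions, not the asker's data.
idx = ["a", "b", "c"]
df1 = pd.DataFrame({"x": [0.5, 2.0, 3.0], "y": [1.5, 0.8, 0.0]}, index=idx)
df2 = pd.DataFrame({"x": [1.0, 5.0, 7.0], "y": [4.0, 2.0, 1.0]}, index=idx)
df3 = pd.DataFrame({"x": [9.0, 1.0, 3.0], "y": [2.0, 8.0, 6.0]}, index=idx)

# np.where returns a bare ndarray: the 3x2 shape survives, the labels do not.
raw = np.where(df1 > 1, (df2 - df3) / (df1 - 1), df3)

# Assigning through [:] writes those values into the existing labelled frame.
res = df3.copy()
res[:] = raw

print(type(raw).__name__)   # ndarray
print(res.loc["b", "x"])    # (5 - 1) / (2 - 1) = 4.0
```

Here raw has no index or columns at all, while res can still be addressed by label.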

Alternatively, for a series you could write, as @Kadir does in his answer:



res = pd.Series(np.where( df1>1, (df2-df3)/(df1-1), df3 ), index=df1.index)

      

Or similarly for a dataframe, you can write:

res = pd.DataFrame(np.where( df1>1, (df2-df3)/(df1-1), df3 ), index=df1.index,
                                                              columns=df1.columns)
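As a pandas-only alternative to the numpy.where approach above (a different technique, not what this answer originally proposed), DataFrame.where keeps the labels without any wrapping. The sketch below assumes two-level column labels as a stand-in for the nested keys mentioned in the question; the label names and values are hypothetical.

```python
import pandas as pd

# Hypothetical two-level column labels, standing in for the question's nested keys.
cols = pd.MultiIndex.from_tuples(
    [("instanceA", "parameter1"), ("instanceA", "parameter2")]
)
df1 = pd.DataFrame([[0.5, 2.0], [3.0, 0.2]], columns=cols)
df2 = pd.DataFrame([[1.0, 5.0], [7.0, 2.0]], columns=cols)
df3 = pd.DataFrame([[9.0, 1.0], [3.0, 8.0]], columns=cols)

# Keep the ratio where df1 > 1, otherwise fall back to the value from df3.
res = ((df2 - df3) / (df1 - 1)).where(df1 > 1, df3)

# The result is still a DataFrame, so nested label access keeps working.
print(res["instanceA"]["parameter2"].tolist())  # [4.0, 8.0]
```

Note that the ratio is still computed for every cell before the condition is applied, exactly as with numpy.where, so cells where df1 equals 1 produce an infinite or NaN intermediate value that is then discarded.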

      



Building on the idea in JohnE's answer, I came up with this solution:

res = pd.Series(np.where( df1 > 1, (df2-df3)/(df1-1), df3 ), index=df1.index)

      



A better answer that works with DataFrames would be appreciated.



Let's say df is your original dataframe, with df1, df2 and df3 as columns, and res is your new column. Use a combination of value assignment and boolean indexing.

Initialize res as a copy of df3:

 df['res'] = df['df3']

      

Then adjust the values for your condition.

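# use a single .loc call: chained indexing (df[mask]['res'] = ...) may assign to a temporary copy
df.loc[df['df1'] > 1.0, 'res'] = (df['df2'] - df['df3']) / (df['df1'] - 1)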
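A minimal runnable sketch of this column-based approach, assuming a single frame whose columns are literally named df1, df2 and df3 (the values are made up). It does the conditional assignment in one .loc call with a boolean mask, because chained indexing can end up assigning to a temporary copy and leave df unchanged.

```python
import pandas as pd

# Illustrative values only; the column names follow this answer's convention.
df = pd.DataFrame({
    "df1": [0.5, 2.0, 3.0],
    "df2": [1.0, 5.0, 7.0],
    "df3": [9.0, 1.0, 3.0],
})

# Start with res equal to df3 everywhere.
df["res"] = df["df3"]

# Overwrite only the rows where the condition holds; the right-hand side
# is aligned on the index, so each row receives its own value.
mask = df["df1"] > 1.0
df.loc[mask, "res"] = (df["df2"] - df["df3"]) / (df["df1"] - 1)

print(df["res"].tolist())  # [9.0, 4.0, 2.0]
```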

      
