Using np.where but keeping existing values if the condition is False

I love np.where but still haven't managed to fully grasp it.

I have a DataFrame; let's say it looks like this:

import pandas as pd
import numpy as np
from numpy import nan as NA
DF = pd.DataFrame({'a' : [ 3, 0, 1, 0, 1, 14, 2, 0, 0, 0, 0],
                   'b' : [ 3, 0, 1, 0, 1, 14, 2, 0, 0, 0, 0],
                   'c' : [ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                   'd' : [5, 1, 2 ,1, 1 ,22, 30, 1, 0, 0, 0]})


Now what I want to do is replace the 0 values with NaN when all values in a row are zero. Critically, I want to keep the existing values in a row whenever that row contains any non-zero value.

I want to do something like this:

cols = ['a', 'b', 'c', 'd']
condition = (DF[cols] == 0).all(axis=1)
for col in cols:
    DF[col] = np.where(condition, NA, ???)


I put ??? to indicate that I don't know what value goes there; if the condition is False, I just want to keep what is already there. Is this possible with np.where, or should I use a different technique?

+6




2 answers


There is a pandas.Series method for exactly this task (namely, where). It seems a bit backwards at first, but from the documentation:

Series.where(cond, other=nan, inplace=False, axis=None, level=None, try_cast=False, raise_on_error=True)

Return an object of same shape as self and whose corresponding entries are from self where cond is True and otherwise are from other.
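
As a minimal sketch of those semantics (the Series and mask below are made up purely for illustration): entries where the condition is True are kept, everything else is replaced by other, which defaults to NaN.

import pandas as pd

s = pd.Series([3, 0, 1, 0])
mask = s != 0                # True where the original value should be kept

# Where mask is True the value stays; elsewhere the default `other` (NaN) is used
print(s.where(mask))         # 3.0, NaN, 1.0, NaN (dtype becomes float64 to hold NaN)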

So your example would become

cols = ['a', 'b', 'c', 'd']
condition = (DF[cols] == 0).all(axis=1)
for col in cols:
    # where() keeps values where its condition is True, so negate:
    # keep the value unless the whole row is zero, otherwise use NaN
    DF[col].where(~condition, np.nan, inplace=True)


But if all you are trying to do is replace the all-zero rows with NA for a specific set of columns, you can do this instead:



DF.loc[condition, cols] = NA
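
As a quick check on the sample data (a sketch, applied to a copy of the DF from the question): only the last three rows are all-zero, so only they end up as NaN, and the columns are upcast to float64 so they can hold it.

check = DF.copy()
check.loc[condition, cols] = NA

print(check.iloc[8:])         # rows 8-10 are now entirely NaN
print(check.dtypes.unique())  # [dtype('float64')]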


EDIT

To answer the original question: np.where follows the same broadcasting rules as other array operations, so you should replace ??? with DF[col], changing your example to:

cols = ['a', 'b', 'c', 'd']
condition = (DF[cols] == 0).all(axis=1)
for col in cols:
    # keep DF[col] where the condition is False, use NA where it is True
    DF[col] = np.where(condition, NA, DF[col])
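
Note that np.where returns a plain NumPy array rather than a Series, and the column is upcast to float so it can hold NaN. A quick sketch (reusing DF, cols and condition from above) showing that the loop produces the same frame as the .loc approach:

result_np = DF.copy()
for col in cols:
    result_np[col] = np.where(condition, NA, result_np[col])

result_loc = DF.copy()
result_loc.loc[condition, cols] = NA

# Both approaches replace exactly the all-zero rows (8-10) with NaN
print(result_np.equals(result_loc))  # True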


+9




You can do something like this:

array_binary = np.where(array[i] < threshold, 0, 1)
array_sparse = np.multiply(array_binary, np.ones_like(array))




Perform element-wise multiplication of the binary array and an array of ones using np.multiply; hence the non-zero elements are preserved. array_sparse is a sparse version of array.
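
A self-contained sketch of that idea (array and threshold here are invented for illustration, since the answer does not define them); note that multiplying the mask by the array itself, rather than by np.ones_like(array), is what keeps the original values:

import numpy as np

array = np.array([0.2, 3.0, 0.0, 7.5, 0.4])        # hypothetical data
threshold = 1.0                                    # hypothetical cut-off

array_binary = np.where(array < threshold, 0, 1)   # 1 where the value passes the threshold
array_sparse = np.multiply(array_binary, array)    # original values survive, the rest become 0
print(array_sparse)                                # [0.  3.  0.  7.5 0. ]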

0








