Using np.where but keeping existing values if condition is False
I love np.where but still haven't managed to fully grasp it.
I have a dataframe; let's say it looks like this:
import pandas as pd
import numpy as np
from numpy import nan as NA
DF = pd.DataFrame({'a': [3, 0, 1, 0, 1, 14, 2, 0, 0, 0, 0],
                   'b': [3, 0, 1, 0, 1, 14, 2, 0, 0, 0, 0],
                   'c': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                   'd': [5, 1, 2, 1, 1, 22, 30, 1, 0, 0, 0]})
Now what I want to do is replace the 0 values with NaN when all values in a row are zero. Critically, I want to keep whatever other values are in a row in cases where not all of the row's values are zero.
I want to do something like this:
cols = ['a', 'b', 'c', 'd']
condition = (DF[cols] == 0).all(axis=1)
for col in cols:
    DF[col] = np.where(condition, NA, ???)
I put ??? to indicate that I don't know what value to put there: if the condition is False, I just want to keep what is already there. Is this possible with np.where, or should I use a different technique?
There is a pandas.Series method for this task (incidentally also called where). It seems a bit backward at first, but from the documentation:
Series.where(cond, other=nan, inplace=False, axis=None, level=None, try_cast=False, raise_on_error=True)
Return an object of the same shape as self whose corresponding entries are from self where cond is True, and otherwise from other.
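A small sketch of that behaviour on a toy Series (the names s and cond are made up for illustration): values are kept where cond is True and replaced by other where it is False, so the condition describes what to keep rather than what to replace.

```python
import numpy as np
import pandas as pd

s = pd.Series([3, 0, 1, 0])
cond = s != 0  # True where the existing value should be kept

# Series.where keeps s where cond is True and uses `other` (NaN by default) elsewhere
kept = s.where(cond)

# np.where spells out both branches: value if True, value if False
same = np.where(cond, s, np.nan)

print(kept.tolist())  # [3.0, nan, 1.0, nan]
```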
So your example would become
cols = ['a', 'b', 'c', 'd']
condition = (DF[cols] == 0).all(axis=1)
for col in cols:
    # Keep values where the row is NOT all zeros, use NaN elsewhere
    DF[col] = DF[col].where(~condition, np.nan)
But if all you are trying to do is replace the rows of all zeros for a specific set of columns with NA, you can do this instead:
DF.loc[condition, cols] = NA
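As a quick check of the .loc approach, here is a sketch on a shortened stand-in for the question's frame (the four-row data and the name df are assumptions, chosen to keep the output small). The cast to float is my addition so that NaN fits the column dtype. Only the all-zero row is blanked; a row that merely contains some zeros is left alone.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [3, 0, 1, 0],
                   'b': [3, 0, 1, 0],
                   'c': [0, 0, 0, 0],
                   'd': [5, 1, 2, 0]})
cols = ['a', 'b', 'c', 'd']

# Assumption: upcast to float first so NaN can be stored without a dtype change
df = df.astype(float)

# True only for rows in which every selected column is zero
condition = (df[cols] == 0).all(axis=1)

# Blank out just those rows; everything else is untouched
df.loc[condition, cols] = np.nan
print(df)
```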
EDIT
To answer the original question: np.where follows the same broadcasting rules as other array operations, so you should replace ??? with DF[col], changing your example to:
cols = ['a', 'b', 'c', 'd']
condition = (DF[cols] == 0).all(axis=1)
for col in cols:
    DF[col] = np.where(condition, NA, DF[col])
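The same broadcasting also makes it possible to drop the loop. A sketch, assuming a small stand-in frame (the data and the names df and mask are illustrative): the row condition is reshaped into a column vector so that it broadcasts across all selected columns at once.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [3, 0, 0],
                   'b': [1, 0, 2],
                   'c': [0, 0, 0]})
cols = ['a', 'b', 'c']
condition = (df[cols] == 0).all(axis=1)

# Shape (n_rows, 1) broadcasts against the (n_rows, n_cols) block of values
mask = condition.to_numpy()[:, None]

# NaN where the whole row is zero, the existing value everywhere else
df[cols] = np.where(mask, np.nan, df[cols].to_numpy())
print(df)
```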
You can do something like this:
array_binary = np.where(array[i] < threshold, 0, 1)
array_sparse = np.multiply(array_binary, np.ones_like(array))
This performs an elementwise multiplication of the binary array and an array of ones using np.multiply, so the non-zero elements are preserved. array_sparse is a sparse version of the array.
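For comparison, a minimal sketch of the mask-and-multiply idea on a 1-D array (array, threshold and the sample values are assumptions for illustration). Note that this version multiplies the mask by the original array rather than by an array of ones, since that is what carries the original values through. Entries below the threshold become 0, not NaN.

```python
import numpy as np

array = np.array([0.2, 1.5, 0.0, 3.0])
threshold = 1.0

# 0 where the value is below the threshold, 1 elsewhere
array_binary = np.where(array < threshold, 0, 1)

# Multiplying by the original array keeps the values where the mask is 1
array_sparse = np.multiply(array_binary, array)
print(array_sparse.tolist())  # [0.0, 1.5, 0.0, 3.0]
```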