NumPy: select rows that contain no zeros

I only want to select rows that don't have any 0 element.

import numpy as np

data = np.array([[1,2,3,4,5],
                [6,7,0,9,10],
                [11,12,13,14,15],
                [16,17,18,19,0]])

      

Result:

array([[1,2,3,4,5],
       [11,12,13,14,15]])

      



2 answers


Use numpy.all:



>>> data[np.all(data, axis=1)]
array([[ 1,  2,  3,  4,  5],
       [11, 12, 13, 14, 15]])
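
To see why this works: np.all evaluates each element by truthiness, so a row passes only when none of its entries is 0. Here is a minimal sketch using the data from the question (the name mask is illustrative and not part of the original answer):

```python
import numpy as np

data = np.array([[1, 2, 3, 4, 5],
                 [6, 7, 0, 9, 10],
                 [11, 12, 13, 14, 15],
                 [16, 17, 18, 19, 0]])

# np.all treats 0 as False and every other integer as True,
# so the result is one boolean per row.
mask = np.all(data, axis=1)
print(mask)        # [ True False  True False]

# Boolean indexing keeps only the rows where the mask is True.
print(data[mask])
```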

      



You can detect all zeros with data == 0, which gives you a boolean array, and then apply np.any along each row of it. Alternatively, you can detect all non-zeros with data != 0 and then apply np.all to get a mask of the rows that contain no zeros.
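
The two masks described above can be built step by step. The following is only a sketch with the question's data, and the intermediate names (is_zero, row_has_zero, row_all_nonzero) are made up for illustration:

```python
import numpy as np

data = np.array([[1, 2, 3, 4, 5],
                 [6, 7, 0, 9, 10],
                 [11, 12, 13, 14, 15],
                 [16, 17, 18, 19, 0]])

# Element-wise comparison: True wherever an entry equals 0.
is_zero = (data == 0)

# Reduce along axis 1: True for rows that contain at least one zero.
row_has_zero = is_zero.any(axis=1)

# The complementary route: True for rows whose entries are all non-zero.
row_all_nonzero = (data != 0).all(axis=1)

# Both routes describe the same rows, one as the negation of the other.
print(np.array_equal(~row_has_zero, row_all_nonzero))
```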

np.einsum can also be used to replace np.any, which I personally think is crazy, but in a good way, as it gives us a noticeable performance boost, as we'll confirm later in this solution.

Thus, you will have three approaches listed below.

Approach # 1:

rows_without_zeros = data[~np.any(data==0, axis=1)]

      

Approach # 2:

rows_without_zeros = data[np.all(data!=0, axis=1)]

      



Approach # 3:

rows_without_zeros = data[~np.einsum('ij->i',data ==0)]
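
As a sanity check, the three approaches can be compared on random input. This sketch assumes a seeded generator and an arbitrary array size, neither of which comes from the original answer. It also relies on np.einsum reducing a boolean array to a boolean result, which is what makes the ~ negation behave as a mask:

```python
import numpy as np

# Hypothetical test input: seeded so the check is repeatable.
rng = np.random.default_rng(0)
data = rng.integers(-10, 10, size=(1000, 10))

rows_any = data[~np.any(data == 0, axis=1)]          # Approach 1
rows_all = data[np.all(data != 0, axis=1)]           # Approach 2
rows_einsum = data[~np.einsum('ij->i', data == 0)]   # Approach 3

# All three should select exactly the same rows, in the same order.
same = np.array_equal(rows_any, rows_all) and np.array_equal(rows_any, rows_einsum)
print(same)

# None of the selected rows should contain a zero.
print(bool((rows_any != 0).all()))
```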

      

Runtime tests

This section times the three approaches listed above and also includes timings for @Ashwini Chaudhary's approach, which is also based on np.all but does not use a mask or boolean array (at least in the frontend).

In [129]: data = np.random.randint(-10,10,(10000,10))

In [130]: %timeit data[np.all(data, axis=1)]
1000 loops, best of 3: 1.09 ms per loop

In [131]: %timeit data[np.all(data!=0, axis=1)]
1000 loops, best of 3: 1.03 ms per loop

In [132]: %timeit data[~np.any(data==0,1)]
1000 loops, best of 3: 1 ms per loop

In [133]: %timeit data[~np.einsum('ij->i',data ==0)]
1000 loops, best of 3: 825 µs per loop

      

So it seems that supplying masks to np.all or np.any gives a small (about 9%) performance improvement over the approach without an explicit mask. With einsum, you are looking at about a 20% improvement over the np.any and np.all approaches, which is not bad!
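
To repeat the comparison outside IPython, something like the following sketch with the standard timeit module could be used. The labels, seed, and repeat counts are arbitrary choices, and the numbers will vary with the machine and the NumPy version, so they may not match the figures quoted above:

```python
import timeit
import numpy as np

# Same shape and value range as in the timings above; the seed is an assumption.
rng = np.random.default_rng(0)
data = rng.integers(-10, 10, size=(10000, 10))

candidates = {
    "all, no mask": lambda: data[np.all(data, axis=1)],
    "all, != 0 mask": lambda: data[np.all(data != 0, axis=1)],
    "any, == 0 mask": lambda: data[~np.any(data == 0, axis=1)],
    "einsum, == 0 mask": lambda: data[~np.einsum('ij->i', data == 0)],
}

timings = {}
for name, fn in candidates.items():
    # Best of 5 repeats with 100 calls each, converted to seconds per call.
    timings[name] = min(timeit.repeat(fn, number=100, repeat=5)) / 100

for name, seconds in timings.items():
    print(f"{name:>18}: {seconds * 1e6:.0f} µs per call")
```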
