NumPy: select rows that contain no zeros

I only want to select rows that don't have any 0 element.

data = np.array([[1,2,3,4,5],







2 answers

Use numpy.all, which treats every non-zero element as True:


>>> data[np.all(data, axis=1)]
array([[ 1,  2,  3,  4,  5],
       [11, 12, 13, 14, 15]])
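
As a minimal, self-contained sketch of this (the sample array below is an assumption, since the array in the question is cut off), np.all with axis=1 yields one boolean per row, and that boolean array then indexes the rows:

```python
import numpy as np

# Hypothetical sample data; the array in the question is truncated.
data = np.array([[ 1,  2,  3,  4,  5],
                 [ 6,  7,  0,  9, 10],
                 [11, 12, 13, 14, 15],
                 [ 0,  0,  0,  0,  0]])

# Non-zero numbers count as True, so a row passes only if it has no zeros.
row_has_no_zero = np.all(data, axis=1)

# Boolean indexing keeps the rows whose mask entry is True.
rows_without_zeros = data[row_has_no_zero]

print(row_has_no_zero)
print(rows_without_zeros)
```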




You can detect all zeros with data == 0, which gives you a boolean array, and then apply np.any along each row of it. Alternatively, you can detect all non-zeros with data != 0 and then apply np.all to get a mask of the rows without zeros.

np.einsum can also be used in place of np.any, which I personally think is crazy, but in a good way, as it gives us a noticeable performance boost, as we'll confirm later in this answer.

Thus, you will have three approaches listed below.

Approach #1:

rows_without_zeros = data[~np.any(data == 0, axis=1)]


Approach #2:

rows_without_zeros = data[np.all(data != 0, axis=1)]


Approach #3:

rows_without_zeros = data[~np.einsum('ij->i', data == 0)]
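
To check that the three approaches agree, here is a small sketch on made-up data (the array and the variable names are illustrative assumptions, not taken from the question). It also prints the intermediate boolean arrays, including the one np.einsum produces when it reduces a boolean input along each row.

```python
import numpy as np

# Illustrative data: the second and fourth rows contain at least one zero.
data = np.array([[ 1, -2,  3],
                 [ 4,  0,  6],
                 [ 7,  8,  9],
                 [ 0,  0,  0]])

is_zero = data == 0                         # element-wise boolean array

any_zero = np.any(is_zero, axis=1)          # True where a row has a zero
all_nonzero = np.all(data != 0, axis=1)     # True where a row has no zero
einsum_zero = np.einsum('ij->i', is_zero)   # row-wise reduction of the booleans

print(any_zero)
print(all_nonzero)
print(einsum_zero)

approach_1 = data[~any_zero]
approach_2 = data[all_nonzero]
approach_3 = data[~einsum_zero]

# All three selections should contain the same rows.
assert np.array_equal(approach_1, approach_2)
assert np.array_equal(approach_1, approach_3)
print(approach_1)
```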


Runtime tests

This section times the three approaches from this answer and also includes timings for @Ashwini Chaudhary's approach, which is also based on np.all but does not build an explicit mask or boolean array first (at least not in the frontend).

In [129]: data = np.random.randint(-10,10,(10000,10))

In [130]: %timeit data[np.all(data, axis=1)]
1000 loops, best of 3: 1.09 ms per loop

In [131]: %timeit data[np.all(data!=0, axis=1)]
1000 loops, best of 3: 1.03 ms per loop

In [132]: %timeit data[~np.any(data==0,1)]
1000 loops, best of 3: 1 ms per loop

In [133]: %timeit data[~np.einsum('ij->i',data ==0)]
1000 loops, best of 3: 825 µs per loop


So it seems that supplying a mask to np.all or np.any gives a small (about 9%) performance improvement over the mask-free approach. With np.einsum you get a further improvement over approaches 1 and 2 (825 µs versus roughly 1 ms in the timings above), which is not bad!
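
Outside IPython, a comparison along these lines can be reproduced with the standard timeit module. The following is a rough sketch: the seed, the repeat counts, and the names candidates and timings are arbitrary choices, and the absolute numbers will differ by machine and NumPy version.

```python
import timeit

import numpy as np

np.random.seed(0)  # arbitrary seed, only for repeatability
data = np.random.randint(-10, 10, (10000, 10))

# The four expressions timed above, wrapped as zero-argument callables.
candidates = {
    "np.all(data)": lambda: data[np.all(data, axis=1)],
    "np.all(data != 0)": lambda: data[np.all(data != 0, axis=1)],
    "~np.any(data == 0)": lambda: data[~np.any(data == 0, axis=1)],
    "~np.einsum(data == 0)": lambda: data[~np.einsum('ij->i', data == 0)],
}

calls = 50  # calls per repeat; an arbitrary, fairly small number
timings = {}
for label, func in candidates.items():
    # Best of 3 repeats, reported as seconds per single call.
    best = min(timeit.repeat(func, number=calls, repeat=3))
    timings[label] = best / calls

for label, seconds in timings.items():
    print(f"{label:<24} {seconds * 1e6:8.1f} us per call")
```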






