I am using "np.count_nonzero(myarray) > smallvalue" on a bunch of numpy arrays. Can I stop counting partway through, as soon as "smallvalue" is exceeded?
The arrays I check are boolean. In this case, np.count_nonzero() seems to be the most efficient way of doing the "sum". I'm still wondering if there is a way to do this faster, perhaps by doing the greater-than check while counting.
Here is a toy example that times my approach (I know I should be using "timeit", and averaging over 100 trials is pretty crude, but whatever) on one huge array rather than many small ones, and then on a small array to show how much faster it could be:
    import time
    import numpy as np

    hugeflatarray = np.ones(100000000, dtype=bool)
    smallflatarray = np.ones(10, dtype=bool)
    smallvalue = 1

    # time.clock() was removed in Python 3.8; time.perf_counter() replaces it
    mytimes = []
    for i in range(100):
        t1 = time.perf_counter()
        np.count_nonzero(hugeflatarray) > smallvalue
        t2 = time.perf_counter()
        mytimes.append(t2 - t1)
    print("average time for huge array: " + str(np.mean(mytimes)))

    mytimes = []
    for i in range(100):
        t1 = time.perf_counter()
        np.count_nonzero(smallflatarray) > smallvalue
        t2 = time.perf_counter()
        mytimes.append(t2 - t1)
    print("average time for small array: " + str(np.mean(mytimes)))

Output:
average time for huge array: 0.0111809413765
average time for small array: 9.83558325865e-07
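For comparison, the same measurement could be taken with the standard "timeit" module instead of hand-rolled timestamps. This is only a minimal sketch: the helper name average_time and its parameters are made up for illustration and are not part of the original benchmark.

```python
import timeit

import numpy as np


def average_time(arr, smallvalue, number=100):
    """Average seconds per evaluation of the count-and-compare expression.

    `average_time`, `arr`, and `number` are hypothetical names for this sketch.
    """
    # timeit.timeit returns the total time for `number` calls of the callable
    total = timeit.timeit(lambda: np.count_nonzero(arr) > smallvalue, number=number)
    return total / number
```

Calling average_time(hugeflatarray, smallvalue) and average_time(smallflatarray, smallvalue) would give numbers comparable to the two averages printed above.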
np.count_nonzero() probably works by looping through the entire array and accumulating the count as it goes, right? Wouldn't it be faster if there were a way to stop as soon as "smallvalue" is exceeded, i.e. to short-circuit?
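One way to approximate this short-circuit using only NumPy is to count in chunks and stop once the running total passes the threshold. The sketch below is an illustration under assumptions, not an established NumPy feature: the function name count_exceeds and the chunk_size default are invented here, and the best chunk size would have to be found by measurement.

```python
import numpy as np


def count_exceeds(arr, smallvalue, chunk_size=65536):
    """Return True as soon as the running nonzero count exceeds smallvalue.

    `count_exceeds` and `chunk_size` are hypothetical names for this sketch.
    """
    flat = arr.ravel()  # a view (no copy) when the array is contiguous
    count = 0
    for start in range(0, flat.size, chunk_size):
        # Count one chunk at a time with the fast vectorized routine
        count += np.count_nonzero(flat[start:start + chunk_size])
        if count > smallvalue:
            return True  # early exit: the remaining chunks are never scanned
    return False
```

When the threshold is crossed early, only the first chunk or two get scanned. In the worst case (threshold never reached) the whole array is still scanned, with some extra Python loop overhead on top.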
@user2357112 Following your advice, I tried a numba solution, and it looks a little faster than np.count_nonzero(hugearray) > smallvalue. Thanks! Here's my solution:
for i in hugearray:
I used this odd "break, THEN return" structure because numba doesn't seem to support return statements inside a for loop, but in practice it makes no difference.
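A "break, then return" loop of this kind might look roughly like the sketch below. It is a reconstruction under assumptions, not the code from the post: the function name exceeds_threshold and the variables count and result are invented for illustration, and the fallback decorator is there only so the snippet also runs (slowly) without numba installed.

```python
import numpy as np

try:
    from numba import njit  # optional dependency
except ImportError:
    def njit(func):
        # Fallback: leave the function as plain Python when numba is missing
        return func


@njit
def exceeds_threshold(hugearray, smallvalue):
    """Hypothetical helper: True once more than smallvalue elements are True."""
    count = 0
    result = False
    for i in hugearray:
        if i:
            count += 1
            if count > smallvalue:
                result = True
                break  # leave the loop first ...
    return result  # ... then return from a single exit point
```

With numba available, the first call includes compilation time, so it should be excluded from any timing comparison.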