Summation Performance on Small Arrays

I have some code that could use some speedup. Unfortunately, the bottleneck already seems to be in the numpy code itself: to my surprise, it is np.sum, not my logarithms, that consumes most of the time.

Here is a toy example:

import numpy as np
from scipy.special import xlogy

p = np.random.rand(20, 4)

def many_foo(p):
    for i in range(100000):
        foo(p)

def foo(p):
    p_bar = xlogy(p, p)    # elementwise p * log(p)
    p_sum = p_bar.sum()    # ndarray method
    p_sum = np.sum(p_bar)  # free function, kept alongside for comparison

kernprof (%lprun -f foo many_foo(p)) gives the following line-by-line profile:

Timer unit: 1e-06 s

Total time: 1.43734 s
File: <ipython-input-63-8f7d1e3cad35>
Function: foo at line 10

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    10                                           def foo(p):
    11    100000       259856      2.6     18.1      p_bar = xlogy(p, p)
    12    100000       404256      4.0     28.1      p_sum = p_bar.sum()
    13    100000       773232      7.7     53.8      p_sum = np.sum(p_bar)


The results surprise me. p_bar.sum() outperforming np.sum(p_bar) by 2x hints at significant per-call overhead.
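
To double-check outside the profiler, here is a minimal timeit sketch (numbers will vary by machine; np.add.reduce is included on the assumption that np.sum ultimately delegates to it, so the gap should isolate the dispatch overhead):

import timeit
import numpy as np
from scipy.special import xlogy

p = np.random.rand(20, 4)
p_bar = xlogy(p, p)

# Time three ways of summing the same small 20x4 array.
for stmt in ("p_bar.sum()", "np.sum(p_bar)", "np.add.reduce(p_bar, axis=None)"):
    t = timeit.timeit(stmt, globals=globals(), number=100_000)
    print(f"{stmt:35} {t:.3f} s")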

But why is the logarithm computed so quickly compared to the sum? And is there any hope of speeding this up without switching to C++?

Note that the small dimensions of p are representative of the problem I have to solve.
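
For what it's worth, the only pure-numpy variant I can think of is to call np.add.reduce directly, on the assumption that np.sum spends its extra time in Python-level argument handling before delegating (a sketch, not a verified fix):

import numpy as np
from scipy.special import xlogy

def foo_reduce(p):
    p_bar = xlogy(p, p)
    # np.add.reduce skips np.sum's Python-level dispatch; with only
    # 80 elements, that fixed overhead can dominate the actual summation.
    return np.add.reduce(p_bar, axis=None)

If even that is not enough, batching many small arrays into one larger array, so that the per-call overhead is amortized, is the other option I see short of C++.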
