Cython Partial Derivative

I have a Python script where, as part of an evolutionary optimization algorithm, I evaluate partial derivatives many thousands of times. I made a line-by-line profile, and this partial derivative calculation takes up most of the execution time. I am using scipy.optimize.approx_fprime to calculate the partial derivatives, and I tried to rewrite it in Cython without much success.

The line profile is shown below. My cythonized version of scipy.optimize.approx_fprime is just called approx_fprime.

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
84                                           @profile
100      1500     14889652   9926.4     25.3      df1 = approx_fprime(inp_nom,evaluate1,epsilon)
101      1500     14939889   9959.9     25.4      df2 = scipy.optimize.approx_fprime(inp_upp,evaluate1,epsilon)

      

Below is my cython file.

import numpy as np
cimport numpy as np
cimport cython

@cython.boundscheck(False)  # turn off bounds-checking for entire function
def approx_fprime(np.ndarray xk, f, double epsilon, *args):
    # From scipy.optimize.approx_fprime
    f0 = f(*((xk,) + args))
    cdef np.ndarray grad = np.zeros((len(xk),), float)
    cdef np.ndarray ei = np.zeros((len(xk),), float)
    cdef np.ndarray d
    for k in range(len(xk)):
        ei[k] = 1.0
        d = epsilon * ei  # must be recomputed after setting ei[k]
        grad[k] = (f(*((xk + d,) + args)) - f0) / d[k]
        ei[k] = 0.0
    return grad

      

I tried to include all the relevant type declarations and make sure it plays well with NumPy. Ultimately, however, the proof is in the pudding, as they say: this version is not actually much faster than the plain SciPy version. The function only has a few variables, so a single call is not a huge computation and there is probably only room for incremental improvement per iteration. However, the function is called over and over by the evolutionary optimization algorithm, so I expect (and hope) that even a small per-call gain, multiplied many times, will have a big payoff.

Can a Cython expert take a look at this code and help me figure out whether I'm on the right track, or whether this is a hopeless undertaking?

Thanks!


1 answer


The first thing to notice is that code optimization is about finding the bottlenecks in your code. There are usually a few functions, loops, etc. that consume most of the time; those are the right candidates for optimization. So, most importantly: measure where your code spends its time with a profiler.

The first optimization for your Python code is to go through it line by line and check each line for new object creation, because creating objects is extremely expensive compared to simple arithmetic. Rule of thumb: avoid creating objects whenever possible, and in particular make sure you are not creating new objects inside time-critical loops.

Take a look at f(*((xk + d,) + args)). This is perfectly fine Python code, but not useful if you need high performance: it creates a new argument tuple and a new perturbed array xk + d at every step of the loop. A rewrite that doesn't create any objects inside the loop is likely to give you a huge performance boost.
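To illustrate, here is a plain NumPy rewrite (no Cython yet) that hoists the allocations out of the loop: it reuses a single work array, perturbing one coordinate in place and restoring it, instead of building xk + d afresh on every iteration. This is only a sketch; the name approx_fprime_lowalloc is made up, and it assumes f accepts the parameter vector followed by any extra args.

```python
import numpy as np

def approx_fprime_lowalloc(xk, f, epsilon, *args):
    # Forward-difference gradient that avoids per-iteration allocations:
    # one reusable work array is perturbed in place and restored, so no
    # new arrays are created inside the loop.
    xk = np.asarray(xk, dtype=float)
    f0 = f(xk, *args)                 # baseline function value
    grad = np.empty_like(xk)
    work = xk.copy()                  # single reusable perturbed copy
    for k in range(xk.shape[0]):
        orig = work[k]
        work[k] = orig + epsilon      # perturb coordinate k
        grad[k] = (f(work, *args) - f0) / epsilon
        work[k] = orig                # restore before the next iteration
    return grad
```

For example, for f(x) = sum(x**2) at x = [1, 2, 3] this returns approximately [2, 4, 6].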



The next step is to add static typing. Make sure you type everything that is used in your loops. Typing k alone will probably gain you a lot.

Then you can try to squeeze out even more by disabling boundscheck and so on.
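Putting these suggestions together, a rough sketch of a faster version might look like the following. This is only a sketch, not a drop-in replacement: the name approx_fprime_fast is made up, it assumes f takes just the parameter vector (no extra args), it perturbs xk in place, and it has to be compiled as a Cython module.

```cython
import numpy as np
cimport numpy as np
cimport cython

@cython.boundscheck(False)
@cython.wraparound(False)
def approx_fprime_fast(np.ndarray[np.float64_t, ndim=1] xk,
                       f, double epsilon):
    # Typed loop index, typed buffers, and no object creation inside
    # the loop: xk is perturbed one coordinate at a time and restored.
    cdef Py_ssize_t k, n = xk.shape[0]
    cdef double f0 = f(xk)
    cdef double orig
    cdef np.ndarray[np.float64_t, ndim=1] grad = np.empty(n, dtype=np.float64)
    for k in range(n):
        orig = xk[k]
        xk[k] = orig + epsilon            # perturb coordinate k
        grad[k] = (f(xk) - f0) / epsilon  # forward difference
        xk[k] = orig                      # restore
    return grad
```

Note that the calls into the Python function f still dominate if f itself is slow; the loop overhead is only part of the cost, which is why profiling after each change matters.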

Most importantly, do your optimizations iteratively and verify each step by profiling your code. Most of the time it is not easy to figure out what the real bottleneck is; profiling gives you clues, and if an optimization doesn't gain you much, you probably missed the bottleneck.
