Optimizing Python code

I wrote the following function to estimate orientation from a 3-axis accelerometer signal (X, Y, Z):

X.shape
Out[4]: (180000L,)
Y.shape
Out[4]: (180000L,)
Z.shape
Out[4]: (180000L,)

def estimate_orientation(self,X,Y,Z):

    sigIn=np.array([X,Y,Z]).T
    N=len(sigIn)
    sigOut=np.empty(shape=(N,3))
    sigOut[sigOut==0]=None
    i=0
    while i<N:
        sigOut[i,:] = np.arccos(sigIn[i,:]/np.linalg.norm(sigIn[i,:]))*180/math.pi
        i=i+1

    return sigOut

      

It takes quite a long time (~2.2 seconds) to execute this function with a signal of 180,000 samples. I know it is not written in a "pythonic" way. Could you help me optimize the execution time?

Thanks!



1 answer


Initial approach

One approach, using broadcasting, would be:

np.arccos(sigIn/np.linalg.norm(sigIn,axis=1,keepdims=1))*180/np.pi
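To illustrate why the one-liner above matches the loop, here is a minimal sketch on a small made-up signal (the sample values and the names `sig_in`, `norms`, `angles_vec`, `angles_loop` are assumptions for illustration, not from the question). With `keepdims=True` the norms stay a column of shape (N, 1), so the division broadcasts across the three axes of each row.

```python
import numpy as np

# Small made-up signal: three samples of (X, Y, Z).
sig_in = np.array([[1.0, 0.0, 0.0],
                   [0.0, 3.0, 4.0],
                   [1.0, 1.0, 1.0]])

# keepdims=True keeps the per-row norms as a column of shape (3, 1),
# so sig_in / norms divides every row by its own norm.
norms = np.linalg.norm(sig_in, axis=1, keepdims=True)
angles_vec = np.arccos(sig_in / norms) * 180 / np.pi

# Reference: the row-by-row computation from the question.
angles_loop = np.empty_like(sig_in)
for i in range(len(sig_in)):
    angles_loop[i] = np.arccos(sig_in[i] / np.linalg.norm(sig_in[i])) * 180 / np.pi

print(norms.shape)                           # (3, 1)
print(np.allclose(angles_vec, angles_loop))  # True
```

Note that a sample with zero norm would produce NaN in either version, since it divides 0 by 0.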

      

Further optimization - I

We could use np.einsum to replace the np.linalg.norm part. Thus:

np.linalg.norm(sigIn,axis=1,keepdims=1)

      

can be replaced with:

np.sqrt(np.einsum('ij,ij->i',sigIn,sigIn))[:,None]
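As a quick sanity check of that replacement, the sketch below compares the two forms on a small made-up array (the values and variable names are assumptions for illustration). The subscripts `'ij,ij->i'` multiply the array by itself element-wise and sum over the column index `j`, giving the sum of squares of each row.

```python
import numpy as np

# Small made-up signal: four samples of (X, Y, Z).
sig_in = np.array([[1.0, 2.0, 2.0],
                   [0.0, 3.0, 4.0],
                   [0.5, 0.5, 0.5],
                   [2.0, 0.0, 0.0]])

# 'ij,ij->i': element-wise product, summed over j, one value per row.
sq_sums = np.einsum('ij,ij->i', sig_in, sig_in)

# [:, None] turns the 1-D result into a column, matching keepdims=True.
norms_einsum = np.sqrt(sq_sums)[:, None]
norms_linalg = np.linalg.norm(sig_in, axis=1, keepdims=True)

print(norms_einsum.shape)                       # (4, 1)
print(np.allclose(norms_einsum, norms_linalg))  # True
```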

      

Further optimization - II



A further boost can come from the numexpr module, which works well with large arrays and with operations involving trigonometric functions. In our case, that function is arccos. So, we keep the einsum part from the previous optimization section and then apply numexpr's arccos to it.

Thus, the implementation will look something like this:

import numpy as np
import numexpr as ne

pi_val = np.pi
s = np.sqrt(np.einsum('ij,ij->i',sigIn,sigIn))[:,None]  # row norms as an (N,1) column
out = ne.evaluate('arccos(sigIn/s)*180/pi_val')
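Since numexpr is a separate package that may not be installed everywhere, one possible arrangement is to fall back to plain NumPy when the import fails. The sketch below is only an illustration under that assumption; the function name `orientation_deg` and the sample values are hypothetical and not part of the answer.

```python
import numpy as np

try:
    import numexpr as ne
except ImportError:
    # numexpr is optional here; fall back to NumPy if it is missing.
    ne = None


def orientation_deg(sig):
    """Angle in degrees between each sample and the X, Y, Z axes."""
    # Row norms as an (N, 1) column, computed with einsum as above.
    s = np.sqrt(np.einsum('ij,ij->i', sig, sig))[:, None]
    if ne is None:
        return np.arccos(sig / s) * 180 / np.pi
    pi_val = np.pi
    # ne.evaluate looks up sig, s and pi_val in the calling frame.
    return ne.evaluate('arccos(sig / s) * 180 / pi_val')


# Small made-up signal: two samples of (X, Y, Z).
sig_in = np.array([[1.0, 0.0, 0.0],
                   [0.0, 3.0, 4.0]])
out = orientation_deg(sig_in)
print(out.shape)  # (2, 3)
```

Both branches should agree up to floating-point rounding, so the caller does not need to know which one ran.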

      

Runtime test

Approaches -

import math

import numpy as np
import numexpr as ne

def original_app(sigIn):
    N=len(sigIn)
    sigOut=np.empty(shape=(N,3))
    sigOut[sigOut==0]=None
    i=0
    while i<N:
        sigOut[i,:] = np.arccos(sigIn[i,:]/np.linalg.norm(sigIn[i,:]))*180/math.pi
        i=i+1
    return sigOut

def broadcasting_app(signIn):
    s = np.linalg.norm(signIn,axis=1,keepdims=1)
    return np.arccos(signIn/s)*180/np.pi

def einsum_app(signIn):
    s = np.sqrt(np.einsum('ij,ij->i',signIn,signIn))[:,None]
    return np.arccos(signIn/s)*180/np.pi

def numexpr_app(signIn):
    pi_val = np.pi
    s = np.sqrt(np.einsum('ij,ij->i',signIn,signIn))[:,None]
    return ne.evaluate('arccos(signIn/s)*180/pi_val')

      

Timing -

In [115]: a = np.random.rand(180000,3)

In [116]: %timeit original_app(a)
     ...: %timeit broadcasting_app(a)
     ...: %timeit einsum_app(a)
     ...: %timeit numexpr_app(a)
     ...: 
1 loops, best of 3: 1.38 s per loop
100 loops, best of 3: 15.4 ms per loop
100 loops, best of 3: 13.3 ms per loop
100 loops, best of 3: 4.85 ms per loop

In [117]: 1380/4.85 # Speedup number
Out[117]: 284.5360824742268

      

That is a roughly 280x speedup over the original loop!
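The timings above rely on IPython's %timeit. Outside IPython, a comparable measurement can be sketched with the standard-library timeit module. The helper `best_time`, the function names, and the reduced array size of 20,000 rows are assumptions made to keep the example quick; the absolute numbers will differ from machine to machine.

```python
import timeit

import numpy as np


def loop_app(sig_in):
    # Row-by-row version, equivalent to the loop in the question.
    out = np.empty((len(sig_in), 3))
    for i in range(len(sig_in)):
        out[i] = np.arccos(sig_in[i] / np.linalg.norm(sig_in[i])) * 180 / np.pi
    return out


def broadcasting_app(sig_in):
    s = np.linalg.norm(sig_in, axis=1, keepdims=True)
    return np.arccos(sig_in / s) * 180 / np.pi


def einsum_app(sig_in):
    s = np.sqrt(np.einsum('ij,ij->i', sig_in, sig_in))[:, None]
    return np.arccos(sig_in / s) * 180 / np.pi


def best_time(func, arg, repeat=3, number=1):
    # Best of `repeat` runs, in seconds per call.
    return min(timeit.repeat(lambda: func(arg), repeat=repeat, number=number)) / number


a = np.random.rand(20000, 3)
timings = {f.__name__: best_time(f, a)
           for f in (loop_app, broadcasting_app, einsum_app)}

for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e3:.2f} ms per call")
```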
