Vectorization or optimization of a loop in which each iteration depends on the state of the previous iteration

I have an algorithm that I am implementing in python. The algorithm can be executed 1,000,000 times, so I want to optimize it as much as possible. The algorithm is based on three lists ( energy

, point

and valList

) and two counters p

and e

.

The two lists energy

, and point

contains a number from 0 to 1, where I make decisions. p

- point counter, and e

- energy counter. I can trade in points for enery, and the cost of each energy is determined in valList

(depends on time). I can also trade in a different way. But I need to trade right away.

Algorithm diagram:

  • Get a boolean list where items in are energy

    above a threshold and items in point

    are below another threshold. This is the decision to trade energy for points. Get the appropriate list for the point that gives the solution to trade points for energy.
  • In each of the boolean lists. Remove all true values ​​that appear after another true value (if I have all the points trading for energy, I cannot repeat this again)
  • For each pair of elements ( pB

    , point bool and eB

    , energy bool) from two logical lists: If pB is true and I have points, I want to exchange all my points for enery. If it is eB

    true and I have energy, I want to exchange all my energy for points.

This is the implementation I came across:

start = time.time()
import numpy as np

np.random.seed(2) #Seed for deterministic result, just for debugging

topLimit = 0.55
bottomLimit = 0.45

#Generate three random arrays, will not be random in the real world
res = np.random.rand(500,3) #Will probably not be much longer than 500
energy = res[:,0]        
point = res[:,1]
valList = res[:,2]

#Step 1:
#Generate two bools that (for ex. energy) is true when energy is above a threashold
#and point below another threshold). The opposite applies to point
energyListBool = ((energy > topLimit) & (point < bottomLimit))
pointListBool = ((point > topLimit) & (energy < bottomLimit))

#Step 2:
#Remove all 'true' that comes after another true since this is not valid
energyListBool[1:] &= energyListBool[1:] ^ energyListBool[:-1]
pointListBool[1:] &= pointListBool[1:] ^ pointListBool[:-1]

p = 100
e = 0

#Step 3:
#Loop through the lists, if point is true, I loose all p but gain p/valList[i] for e
#If energy is true I loose all e but gain valList[i]*e for p
for i in range(len(energyListBool)):
    if pointListBool[i] and e == 0:
        e = p/valList[i] #Trade all points to energy
        p = 0
    elif energyListBool[i] and p == 0:
        p = valList[i]*e #Trade all enery to points
        e = 0

print('p = {0} (correct for seed 2: 3.1108006690739174)'.format(p))
print('e = {0} (correct for seed 2: 0)'.format(e))

end = time.time()
print(end - start)

      

What I am struggling with is how (if it can be done) to vectorize the for loop, so I can use that instead of the for loop, which I think is likely to be faster.

+3


source to share


1 answer


Within the current setup, a problem that is not possible as vectorization essentially requires that your n

-th step of computation should not depend on previous steps n-1

. Sometimes, however, one can find the so-called "closed form" of repetition f(n) = F(f(n-1), f(n-2), ... f(n-k))

, i.e. Find an explicit expression for f(n)

that does not depend on n

, but this is a separate research problem.



Moreover, from an algorithmic point of view, such vectorization will not do much, since the complexity of your algorithm will be C*n = O(n)

. However, since the "complexity constant" C

matters in practice, there are various ways to reduce it. For example, it shouldn't be a big problem to rewrite the critical loop in C / C ++.

+1


source







All Articles