Multiprocessing AttributeError: 'module' object has no attribute '__path__'

I have a long script that, at the end, should run a function over all the elements of a huge list, which takes a long time. It looks like this:

input_a = [1, 2, 3, 4]  # a lengthy computation on some data
print('test.1')  # for testing how the script runs
input_b = [5, 6, 7, 8]  # some other computation
print('test.2')

def input_analyzer(item_a):  # analyzing item_a using input_a and input_b
    return item_a * input_a[0] * input_b[2]

from multiprocessing import Pool

def analyzer_final(input_list):
    pool = Pool(7)
    result = pool.map(input_analyzer, input_list)
    return result

my_list = [10, 20, 30, 40, 1, 2, 2, 3, 4, 5, 6, 7, 8, 9, 90, 1, 2, 3]  # a huge list of inputs

if __name__ == '__main__':
    result_final = analyzer_final(my_list)
    print(result_final)

      

The output of this code varies from run to run, but it's clear the whole script executes more than once: by assigning 7 to the pool, the entire script appears to run about 8 times!


I'm not sure I've understood the concept of multiprocessing well, but I thought it only needed to run the input_analyzer function on multiple processors, not run the whole script multiple times. With my real code, it takes very long and then gives me a strange error:

AttributeError: 'module' object has no attribute '__path__'

I'm not using multiprocessing anywhere else; I'm just running this code. I don't know what I am doing wrong here, especially with the AttributeError: 'module' object has no attribute '__path__'. I appreciate any help.



2 answers


from multiprocessing import Pool as ThreadPool
import requests


API_URL = 'http://example.com/api'
pool = ThreadPool(4) # Hint...

def foo(x):
  params={'x': x}
  r = requests.get(API_URL, params=params)
  return r.json()

if __name__ == '__main__':
  num_iter = [1,2,3,4,5]
  out = pool.map(foo, num_iter)
  print(out)

      

Hint: defining the pool outside if __name__ == '__main__' is why the exception is thrown.

Fixed...



from multiprocessing import Pool as ThreadPool
import requests


API_URL = 'http://example.com/api'

def foo(x):
  params={'x': x}
  r = requests.get(API_URL, params=params)
  return r.json()

if __name__ == '__main__':
  pool = ThreadPool(4) # Hint...
  num_iter = [1,2,3,4,5]
  out = pool.map(foo, num_iter)
  print(out)
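
On Python 3.3+ you can also let a with block manage the pool for you; here is a variant of the fixed example above (same hypothetical API_URL):

from multiprocessing import Pool as ThreadPool
import requests

API_URL = 'http://example.com/api'  # placeholder endpoint from the example above

def foo(x):
    r = requests.get(API_URL, params={'x': x})
    return r.json()

if __name__ == '__main__':
    # The with block terminates the pool automatically on exit.
    with ThreadPool(4) as pool:
        out = pool.map(foo, [1, 2, 3, 4, 5])
    print(out)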

      

The Python docs also cover this scenario: https://docs.python.org/2/library/multiprocessing.html#using-a-pool-of-workers

I haven't found this to be a problem at all when using multiprocessing.dummy.
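
For instance, a minimal thread-based sketch: multiprocessing.dummy exposes the same Pool API backed by threads, so no new process is started and the module is never re-imported, even with the pool created at module level.

from multiprocessing.dummy import Pool as ThreadPool

def square(x):
    return x * x

# Threads run inside the current process, so creating the pool at
# module level does not trigger a re-import of this module.
pool = ThreadPool(4)
print(pool.map(square, [1, 2, 3, 4, 5]))
pool.close()
pool.join()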



multiprocessing must be able to import your module, as stated in the documentation.

You have a bunch of code sitting in the module (global) scope, so it will run every time the module is imported.



Place it in an if __name__ == '__main__' block,

or better yet, a function.
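
Applied to the question's script, that advice might look like the sketch below: the lengthy computations move into main(), and input_a and input_b reach the workers through the pool's initializer rather than through module-level code (the literal lists are just the stand-ins from the question).

from multiprocessing import Pool

# Filled in by init_worker inside each worker process.
input_a = None
input_b = None

def init_worker(a, b):
    # Runs once in every worker process when the pool starts.
    global input_a, input_b
    input_a = a
    input_b = b

def input_analyzer(item_a):
    # analyze item_a using input_a and input_b
    return item_a * input_a[0] * input_b[2]

def main():
    input_a = [1, 2, 3, 4]  # stands in for the lengthy computation
    input_b = [5, 6, 7, 8]  # stands in for the other computation
    my_list = [10, 20, 30, 40, 1, 2, 2, 3, 4, 5, 6, 7, 8, 9, 90, 1, 2, 3]
    pool = Pool(7, initializer=init_worker, initargs=(input_a, input_b))
    result = pool.map(input_analyzer, my_list)
    pool.close()
    pool.join()
    print(result)

if __name__ == '__main__':
    main()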







