Python Multiprocessing: AttributeError: 'Test' object has no attribute 'get_type'
short short version:
I am having a problem with parallelizing code that uses instance methods.
Longer version:
This python code throws an error:
Error
Traceback (most recent call last):
File "/Users/gilzellner/dev/git/3.2.1-build/cloudify-system-tests/cosmo_tester/test_suites/stress_test_openstack/test_file.py", line 24, in test
self.pool.map(self.f, [self, url])
File "/Users/gilzellner/.virtualenvs/3.2.1-build/lib/python2.7/site-packages/pathos/multiprocessing.py", line 131, in map
return _pool.map(star(f), zip(*args)) # chunksize
File "/Users/gilzellner/.virtualenvs/3.2.1-build/lib/python2.7/site-packages/multiprocess/pool.py", line 251, in map
return self.map_async(func, iterable, chunksize).get()
File "/Users/gilzellner/.virtualenvs/3.2.1-build/lib/python2.7/site-packages/multiprocess/pool.py", line 567, in get
raise self._value
AttributeError: 'Test' object has no attribute 'get_type'
This is a simplified version of the real problem.
import urllib2
from time import sleep
from os import getpid
import unittest
from pathos.multiprocessing import ProcessingPool as Pool
class Test(unittest.TestCase):
def f(self, x):
print urllib2.urlopen(x).read()
print getpid()
return
def g(self, y, z):
print y
print z
return
def test(self):
url = "http://nba.com"
self.pool = Pool(processes=1)
for x in range(0, 3):
self.pool.map(self.f, [self, url])
self.pool.map(self.g, [self, url, 1])
sleep(10)
I am using pathos.multiprocessing because of the recommendation here: Multiprocessing: pool and pickle error - pickle error: cannot sort <type 'instancemethod'>: lookup for __builtin __ attribute. instancemethod failed
Before using pathos.multiprocessing, the error was:
"PicklingError: Can't pickle <type 'instancemethod'>: attribute lookup __builtin__.instancemethod failed"
source to share
You are using the multiprocessing method incorrectly map
.
According to the python docs :
Parallel equivalent to the built-in map () function (it only supports at least one iterable argument).
Where standard map
:
Apply the function to each element of the iteration and return the Results list.
from multiprocessing import Pool
def f(x):
return x*x
if __name__ == '__main__':
p = Pool(5)
print(p.map(f, [1, 2, 3]))
What you are looking for is the apply_async method:
def test(self):
url = "http://nba.com"
self.pool = Pool(processes=1)
for x in range(0, 3):
self.pool.apply_async(self.f, args=(self, url))
self.pool.apply_async(self.g, args=(self, url, 1))
sleep(10)
source to share