Starmap modifies a parameter before passing it?
I have a strange error that I am facing when trying to use multiprocessing.Pool.starmap. The minimum code needed to reproduce the error is below:
from multiprocessing import Pool
# Ignore the fact that this class is useless as-is, it has more code but it wasn't relevant to the bug
class Coordinate(tuple) :
    def __new__(cls, *args):
        return tuple.__new__(cls, args)
#Essentially just stores two coordinates
class Move :
    def __init__(self, oldPos, newPos) :
        self.oldPos = oldPos
        self.newPos = newPos
    def __str__(self) :
        return 'Old pos : ' + str(self.oldPos) + ' -- New pos : ' + str(self.newPos)
#Dummy function to show the problem
def funcThatNeedsTwoParams(move, otherParam) :
    print(move)
    # Second param ignored, no problem there
p = Pool(2)
moveOne = Move(Coordinate(0, 2), Coordinate(0, 1))
moveTwo = Move(Coordinate(2, 1), Coordinate(3, 0))
moveThree = Move(Coordinate(22345, -12400), Coordinate(153, 2357))
# The numbers are irrelevant, no effect on whether problem shows up or not
moves = [moveOne, moveTwo, moveThree]
paramsForStarmap = [[move, 'other param'] for move in moves]
print(paramsForStarmap)
#Output :
#[[<__main__.Move object at 0x1023d4438>, 'other param'], [<__main__.Move object at 0x1023d4470>, 'other param'], [<__main__.Move object at 0x1023d44a8>
for move in [params[0] for params in paramsForStarmap] :
    print(move)
#Output :
#Old pos : (0, 2) -- New pos : (0, 1)
#Old pos : (2, 1) -- New pos : (3, 0)
#Old pos : (22345, -12400) -- New pos : (153, 2357)
p.starmap(funcThatNeedsTwoParams, paramsForStarmap)
#Output :
#Old pos : ((0, 2),) -- New pos : ((0, 1),)
#Old pos : ((22345, -12400),) -- New pos : ((153, 2357),)
#Old pos : ((2, 1),) -- New pos : ((3, 0),)
Basically, I have a list of parameter pairs, something like this: [[move, otherParam], [move, otherParam], ...]. I print the first parameter of each pair to show that the moves are valid before starmap is called. Then I call starmap on the pool I created earlier, passing it those parameter pairs. Inexplicably, inside the worker function each coordinate of a move has become a tuple of the form ((coordinate),) instead of (coordinate).
I can't figure out why starmap would change the properties of the objects passed to it. Any help would be greatly appreciated, thanks.
Interesting. The problem is not specific to starmap. It happens with all Pool functions (apply, map, etc.). And as it turns out, the problem is not related to multiprocessing at all. It happens when you pickle and unpickle an instance of the Coordinate class:
>>> import pickle
>>> c = Coordinate(0,2)
>>> print(c)
(0, 2)
>>> str(pickle.loads(pickle.dumps(c)))
'((0, 2),)'
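As a rough sketch of where the extra nesting comes from: this assumes CPython's default handling of tuple subclasses with pickle protocol 2 or later, where the object is rebuilt by calling cls.__new__ with the values returned by __getnewargs__. The names new_args and rebuilt are only illustrative.

```python
class Coordinate(tuple):
    def __new__(cls, *args):
        return tuple.__new__(cls, args)


c = Coordinate(0, 2)

# tuple.__getnewargs__ packs the whole contents into ONE argument.
new_args = c.__getnewargs__()
print(new_args)  # ((0, 2),)

# Default unpickling effectively calls cls.__new__(cls, *new_args).
# Coordinate.__new__ collects its arguments with *args, so the single
# tuple argument gets wrapped in another tuple.
rebuilt = Coordinate.__new__(Coordinate, *new_args)
print(rebuilt)  # ((0, 2),)
```

In other words, a plain tuple expects one iterable in its constructor, while Coordinate expects the individual values, and the default pickling machinery only knows about the first convention.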
Pickling a tuple subclass is not as straightforward as it sounds. You can fix this by defining a __reduce__ method that customizes the pickling process:
class Coordinate(tuple):
    def __new__(cls, *args):
        return tuple.__new__(cls, args)
    def __reduce__(self):
        return (self.__class__, tuple(self))
Now it pickles correctly:
>>> c = Coordinate(0,2)
>>> pickle.loads(pickle.dumps(c))
(0, 2)
And your example code works great.
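To check this without starting a pool, here is a minimal sketch that round-trips a whole Move through pickle, which is the same serialization step Pool applies to arguments before handing them to a worker. The roundtrip helper is a hypothetical name introduced only for this illustration; the classes are the ones from the question with the __reduce__ fix applied.

```python
import pickle


class Coordinate(tuple):
    def __new__(cls, *args):
        return tuple.__new__(cls, args)

    def __reduce__(self):
        # Rebuild as Coordinate(*values) rather than Coordinate(values).
        return (self.__class__, tuple(self))


class Move:
    def __init__(self, oldPos, newPos):
        self.oldPos = oldPos
        self.newPos = newPos

    def __str__(self):
        return 'Old pos : ' + str(self.oldPos) + ' -- New pos : ' + str(self.newPos)


def roundtrip(obj):
    """Hypothetical helper: pickle and unpickle obj, as Pool does for arguments."""
    return pickle.loads(pickle.dumps(obj))


move = Move(Coordinate(0, 2), Coordinate(0, 1))
print(roundtrip(move))  # Old pos : (0, 2) -- New pos : (0, 1)
```

The coordinates nested inside the Move come back with their original shape and type, so the worker function sees the same values as the parent process.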